Decentralized Mutable Torrents #35

Merged on Aug 9, 2016 (23 commits)

lmatteis
Contributor

Discussion in #34

Also included a reference implementation at the bottom.


Consumers issue a ``get`` request using the ID of the mutable torrent they are interested in downloading. They then periodically poll that ID by issuing further ``get`` requests to check whether the ``v`` property of the response has changed. If an update is found, the torrent can be updated using the new infohash.

Both publisher and consumer should periodically ``put`` the mutable items they have active to keep them alive in the DHT.
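
As an illustration of the consumer side described above, here is a minimal sketch of a single polling round. It assumes the response's `v` is a Buffer holding the infohash, as the quoted text states at this point in the review. The names `detectUpdate`, `pollOnce` and `dhtGet` are hypothetical; `dhtGet` stands in for whatever BEP44 `get` call the client's DHT library provides.

```javascript
// Return the new 20-byte infohash if the get response carries one that
// differs from what we already have, otherwise null.
// Assumption: `response.v` is a Buffer holding the infohash.
function detectUpdate(currentInfoHash, response) {
  if (!response || !Buffer.isBuffer(response.v) || response.v.length < 20) {
    return null; // no usable value in this response
  }
  const latest = response.v.subarray(0, 20);
  return latest.equals(currentInfoHash) ? null : latest;
}

// One polling round. `dhtGet` is a hypothetical async function supplied by
// the caller; it should resolve to the get response for `targetId`.
async function pollOnce(dhtGet, targetId, currentInfoHash) {
  const response = await dhtGet(targetId);
  return detectUpdate(currentInfoHash, response);
}
```

A client would call `pollOnce` on a timer and switch to the returned infohash whenever the result is not null.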
Contributor

I think it may be appropriate to propose an algorithm for determining how frequently to re-announce here. For very popular pieces of content, it could be that having every consumer re-announcing would put too much load on the nodes storing it. I believe @the8472 described an algorithm like this when we originally discussed the put/get protocol for distributed RSS feeds, 5 years ago or so.

Basically, if it's possible to estimate how many other peers are republishing the item, or how frequently it is being republished, each peer could lower its own re-announce probability to reduce the load.
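
One possible reading of this suggestion, sketched under assumptions: each peer re-announces with a probability inversely proportional to the estimated number of republishers. This is not the algorithm referred to above, and the names `reannounceProbability`, `shouldReannounce` and `targetAnnounces` are hypothetical.

```javascript
// Probability with which this peer re-announces in a given interval, so that
// roughly `targetAnnounces` peers do so in total (an assumed policy).
function reannounceProbability(estimatedRepublishers, targetAnnounces = 1) {
  if (estimatedRepublishers <= targetAnnounces) return 1;
  return targetAnnounces / estimatedRepublishers;
}

// `random` is injectable so the decision can be tested deterministically.
function shouldReannounce(estimatedRepublishers, targetAnnounces = 1, random = Math.random) {
  return random() < reannounceProbability(estimatedRepublishers, targetAnnounces);
}

console.log(reannounceProbability(1));      // 1
console.log(reannounceProbability(200, 4)); // 0.02
```

How to estimate the number of republishers is left open here, as it is in the comment.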

Contributor

I think a republishing tuning algorithm is general enough that it should go into BEP44.

For a single-item refresh the main idea would be to check how many nodes returned the current value and only refresh when it falls under the redundancy limit.
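
A minimal sketch of that single-item check, assuming the caller has already collected the `get` responses for the item. The function name `shouldRepublish`, the `{ seq }` response shape and the default limit of 8 are hypothetical and not taken from BEP44.

```javascript
// Decide whether to re-put a mutable item, based on how many of the queried
// nodes already returned the current version.
// Assumption: `responses` holds one { seq } object per node that returned the
// item; nodes that do not store it are simply absent from the array.
function shouldRepublish(responses, currentSeq, redundancyLimit = 8) {
  const upToDate = responses.filter((r) => r.seq === currentSeq).length;
  return upToDate < redundancyLimit;
}

// Example: 3 of 5 responding nodes hold seq 7.
const responses = [{ seq: 7 }, { seq: 7 }, { seq: 6 }, { seq: 7 }, { seq: 5 }];
console.log(shouldRepublish(responses, 7, 4)); // true
console.log(shouldRepublish(responses, 7, 3)); // false
```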

Contributor Author

lmatteis commented Jul 30, 2016

I agree with @the8472 that a "keep-alive" algorithm is general enough to go directly into BEP44. After all, other DHT-store apps would need to keep things alive too, for different use cases (not just feeds and mutable torrents).

I'm wondering though, doesn't regular DHT employ a similar algorithm for announces?

Contributor

ssiloti commented Aug 2, 2016

I'm nervous about defining the format of the salt parameter before any use cases for it are even proposed. If salts end up being mostly human readable then URL encoding is obviously appropriate, but if they are usually going to be hashes then it looks oddly inconsistent next to the hex encoded key. Does anyone have any use cases they see for salt values?

Contributor

the8472 commented Aug 2, 2016

Thinking about it, human-readable salts probably are not a good idea since they have a maximum length of 64 bytes, which is fairly short for a name, especially if we consider multi-byte characters.

If we want human-readable names it might be better to have a separate name field and hash that to produce a salt.

I guess the use-case would be a single content provider (same pubkey) providing a bunch of named things?

But if we really wanted to allow human-readable strings we need IRI examples, encoding, test vectors to produce the right salts. Otherwise things are likely to go wrong when someone tries unicode.

Salt in base would be a more conservative approach.

Contributor Author

lmatteis commented Aug 2, 2016

My problem with other N base encodings in my opinion is that we're asking some torrent clients to probably import a library just to do the encoding. For instance to use the base64url in WebTorrent, I have to use an external lib. Other bases are widely supported, but add ugly = chars for padding. So base16 (hex) is certainly the safest/easiest to implement.

I agree that doing Unicode is also hard, but it is certainly well supported. Also 64 bytes seem like a lot for a name:

> Buffer.from('My salt is pretty huge 漢字 foo bar. Can reach 64 bytes? Yup..', 'utf8').length
64

But it's true that we need tests if we want to go down this human-readable path. And since most magnet links are shared inside browsers, are we sure IRIs are fully supported by most common browsers?

Contributor

the8472 commented Aug 2, 2016

> My problem with other N base encodings in my opinion is that we're asking some torrent clients to probably import a library

Not a strong argument in my opinion. Many things need a library or a custom implementation, I don't see how this is any different.

Anyway, we haven't ruled out base16. I'm just agreeing with @ssiloti that URL encoding is probably not a good choice.

As someone mentioned in #34 the advantage of base64 for the pubkey is that one can brute-force some nice-looking keys if desired. And if we're already using base64 for the pubkey we may as well use it for the salt.

> Also 64 bytes seem like a lot for a name:

English is not a good test because Latin characters fit into 1 byte each in UTF-8. Kanji may not be a good test either, because the writing system is more information-dense per character, which may compensate for needing more bytes per character. Arabic, Brahmi or Hebrew scripts need multiple bytes per character without that extra density.
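
To make the comparison concrete, here is a small sketch that measures UTF-8 size per script in Node.js. The sample words are arbitrary choices for illustration (Devanagari stands in for a Brahmic script), and `utf8Stats` is a hypothetical helper name.

```javascript
// Count code points and UTF-8 bytes for a string.
function utf8Stats(text) {
  return { chars: [...text].length, bytes: Buffer.byteLength(text, 'utf8') };
}

// Arbitrary sample words, one per script.
const samples = {
  Latin: 'torrent',
  Kanji: '漢字',
  Arabic: 'سلام',
  Hebrew: 'שלום',
  Devanagari: 'नमस्ते',
};

for (const [script, text] of Object.entries(samples)) {
  const { chars, bytes } = utf8Stats(text);
  // How many code points of this script would fit in a 64-byte salt.
  const fit = Math.floor(64 / (bytes / chars));
  console.log(`${script}: ${chars} code points, ${bytes} bytes, about ${fit} fit in 64 bytes`);
}
```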

Anyway, we don't have to do the human-readable at all. We can just go with an encoded salt and save ourselves from all the trouble.

But if we want to do it then we should go down the hashing route in my opinion.

> are we sure IRIs are fully supported by most common browsers?

Basically, yes. And it's always possible to provide backwards-compatible, percent-encoded URIs if necessary.

Contributor

ssiloti commented Aug 3, 2016

I like the idea of having the salt use the same encoding as the pubkey with a separate, mutually exclusive, field which is explicitly human readable and hashed to produce the salt value.
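
A minimal sketch of the "hash the name to get the salt" idea, using Node's built-in `crypto` module. The thread does not pick a hash function or a Unicode normalization form, so SHA-1 and NFC below are assumptions, and `saltFromName` is a hypothetical helper name.

```javascript
const crypto = require('crypto');

// Derive a fixed-length salt from a human-readable name.
// Assumptions: SHA-1 as the hash, NFC normalization so that visually
// identical names typed on different systems produce the same salt.
function saltFromName(name) {
  return crypto.createHash('sha1').update(name.normalize('NFC'), 'utf8').digest();
}

const salt = saltFromName('Archive.org-2016');
console.log(salt.length);          // 20, well under the 64-byte salt limit
console.log(salt.toString('hex')); // hex form, matching the hex-encoded key
```

Whatever function is chosen, publishing test vectors alongside it would address the concern about Unicode input raised earlier in the thread.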

Contributor Author

lmatteis commented Aug 3, 2016

I guess we don't need the actual salt in the link then?

magnet:?xs=urn:btpk:[ Public Key (Hex) ]&n=[ Name of channel (URL encoded) ]

From a publisher's UI perspective I imagine a section in the settings:

  • Enable publishing mutable torrents

<import/export keypair button>

Using public key: 8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e

| Name | Torrent | Generated magnet link |
| --- | --- | --- |
| Archive.org-2016 | Archive.org-2016-dump.torrent | magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e&n=Archive.org-2016 |
| Archive.org-2012 | Archive.org-2012-dump.torrent | magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e&n=Archive.org-2012 |

last published: 15m ago


I'm a little wary about making the name editable, as a publisher could easily lose the link to all of her consumers if she changes that field.

Contributor Author

lmatteis commented Aug 3, 2016

So guys I'm a little doubtful about assigning readable names to salts. The use case above seems fragile. Anyway the name of the mutable torrent is given by the actual torrent.

Just a hex salt ?s= parameter seems to make more sense. We can auto-increment it to allow multiple torrents under the same pub key.

magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e
magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e&s=1
magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e&s=2
magnet:?xs=urn:btpk:8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e&s=3
etc...
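
A minimal sketch of generating such links, assuming a 32-byte public key given as 64 hex characters and a salt passed as a Buffer of at most 64 bytes. `mutableMagnet` is a hypothetical helper name. Note that hex-encoding a one-byte counter yields `s=01` rather than the `s=1` shorthand used in the examples above.

```javascript
// Build a magnet link for a mutable torrent from a hex public key and an
// optional salt (Buffer). The salt is hex-encoded like the key.
function mutableMagnet(publicKeyHex, salt) {
  if (!/^[0-9a-f]{64}$/i.test(publicKeyHex)) {
    throw new Error('expected a 32-byte public key as 64 hex characters');
  }
  let uri = `magnet:?xs=urn:btpk:${publicKeyHex}`;
  if (salt && salt.length > 0) {
    if (salt.length > 64) throw new Error('salt is limited to 64 bytes');
    uri += `&s=${salt.toString('hex')}`;
  }
  return uri;
}

const key = '8543d3e6115f0f98c944077a4493dcd543e49c739fd998550a1f614ab36ed63e';
console.log(mutableMagnet(key));                   // no salt
console.log(mutableMagnet(key, Buffer.from([1]))); // ...&s=01
console.log(mutableMagnet(key, Buffer.from([2]))); // ...&s=02
```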

Also I'm thinking of changing the name of the BEP into "Updating torrents via DHT mutable items", or "DHT mutable torrents". "Decentralization" is too strong of a word.

Contributor

feross commented Aug 4, 2016

This is quite an exciting BEP -- thanks for putting in the work to hash this out, @lmatteis!

Contributor

ssiloti commented Aug 5, 2016

please label this BEP 46

Contributor Author

lmatteis commented Aug 6, 2016

@ssiloti done

Publishers should issue a mutable ``put`` request when they want to notify
consumers about an update of a torrent. The value of the payload ``v`` is the 20
byte infohash of such torrent. Note that there is a 1-to-1 mapping between a
mutable DHT item and a torrent.
Contributor

To allow for future extensions there should be a sentence here specifying that data beyond the first 20 bytes should be ignored.

Contributor

BEP44 accepts any bencoded structure, not just plain strings. So if we want extensibility, it should probably be a dictionary with, for now, one key.

Contributor Author

lmatteis commented Aug 8, 2016

Also, apart from the magnet link, the client requesting the mutable item has no idea whether it is an updatable torrent.

So perhaps it can be:

{ v: { ih: <20 byte info hash> } }

@ssiloti let me know what you think.

Contributor

A bencoded dict is fine. I just want to avoid painting ourselves into a corner where we can't change the item's value without breaking existing clients.
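
A minimal sketch of the dictionary form discussed here, assuming the `ih` key proposed above. The encoder only handles a flat dictionary of Buffer values, which is enough for `{ ih }`, and both helper names are hypothetical. The reader ignores keys it does not know, so later additions would not break it.

```javascript
// Bencode a flat dictionary whose values are all Buffers.
// Bencoded dictionaries need sorted keys; plain sort() is sufficient for the
// ASCII keys assumed here.
function bencodeDict(dict) {
  const parts = [Buffer.from('d')];
  for (const key of Object.keys(dict).sort()) {
    const k = Buffer.from(key, 'utf8');
    const v = dict[key];
    parts.push(Buffer.from(`${k.length}:`), k, Buffer.from(`${v.length}:`), v);
  }
  parts.push(Buffer.from('e'));
  return Buffer.concat(parts);
}

// Read the infohash from an already-decoded value, ignoring unknown keys.
function extractInfoHash(value) {
  const ih = value && value.ih;
  return Buffer.isBuffer(ih) && ih.length === 20 ? ih : null;
}

const infoHash = Buffer.alloc(20, 0xab); // placeholder infohash for the demo
const encoded = bencodeDict({ ih: infoHash });
console.log(encoded.subarray(0, 8).toString('latin1')); // d2:ih20:
console.log(extractInfoHash({ ih: infoHash, future: Buffer.from('x') }).equals(infoHash)); // true
```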

Contributor Author

@ssiloti ok check latest changes.

Archive.org could publish their database dumps using decentralized mutable
torrents, and benefit from not having to maintain a central HTTP feed server to
notify consumers about updates.
time in a more decentralized fashion. Consumers interested in publishers
Contributor

the publisher's?

ssiloti merged commit a05d479 into bittorrent:master on Aug 9, 2016
5 participants