Rendezvous Certificates #7517

InvictusRMC · 2023-06-28T11:43:13Z

This PR adds rendezvous certificates to Tribler. This serves as a method for determining the online time of fellow peers. The initial certificate piggybacks on the introduction message of the PopularityCommunity. Subsequent rendezvous pings are served through separate payloads. The implementation does not use a separate community in order to reduce overhead.
The design works as follows: Peer A sends a ping message including a nonce serving as a challenge. Peer B returns the signed nonce. Peer A then increases its counter for Peer B. Online time can be estimated by multiplying the number of pings by the interval between pings.
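The ping flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `RendezvousTracker` and `sign` are hypothetical names, and an HMAC over a shared key stands in for the public-key signature a real peer would produce. The 60-second interval mirrors `PING_INTERVAL_RENDEZVOUS` from the diff.

```python
import hashlib
import hmac
import os
from collections import defaultdict

PING_INTERVAL = 60  # seconds between pings (mirrors PING_INTERVAL_RENDEZVOUS)


def sign(key: bytes, nonce: bytes) -> bytes:
    # Assumption: HMAC is a stand-in for a real public-key signature.
    return hmac.new(key, nonce, hashlib.sha256).digest()


class RendezvousTracker:
    """Hypothetical bookkeeping on Peer A's side."""

    def __init__(self):
        self.challenges = {}              # peer id -> outstanding nonce
        self.counters = defaultdict(int)  # peer id -> number of answered pings

    def create_ping(self, peer_id: str) -> bytes:
        # Peer A picks a fresh random nonce as the challenge.
        nonce = os.urandom(16)
        self.challenges[peer_id] = nonce
        return nonce

    def on_pong(self, peer_id: str, peer_key: bytes, signature: bytes) -> bool:
        # A nonce is single-use: drop it whether or not the check succeeds.
        nonce = self.challenges.pop(peer_id, None)
        if nonce is None:
            return False
        if not hmac.compare_digest(signature, sign(peer_key, nonce)):
            return False
        self.counters[peer_id] += 1
        return True

    def estimated_online_seconds(self, peer_id: str) -> int:
        # Online time is approximated as pings answered times the ping interval.
        return self.counters[peer_id] * PING_INTERVAL
```

Because each nonce is removed once it has been answered, a replayed response does not increase the counter.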

This reverts commit 91d360a.

InvictusRMC · 2023-06-28T11:43:54Z

src/tribler/core/components/popularity/popularity_component.py

@@ -22,13 +24,16 @@ async def run(self):
        metadata_store_component = await self.require_component(MetadataStoreComponent)
        torrent_checker_component = await self.require_component(TorrentCheckerComponent)

+        rendezvous_db = RendezvousDatabase(db_path=self.session.config.state_dir / STATEDIR_DB_DIR / PopularityCommunity.RENDEZVOUS_DB_NAME)


Does this require a separate Component instead?

No, it's perfectly acceptable to create a database here. However, don't forget to close the database if it's necessary.

tribler/src/tribler/core/components/knowledge/knowledge_component.py

Lines 57 to 58 in 3b760ab

        if self.knowledge_db:
            self.knowledge_db.shutdown()

drew2a

First of all, congratulations on your first contribution to the Tribler codebase!

Regarding the PR — it looks great. I've only reviewed the community part so far and we've already discussed some points offline. I assume that addressing these points could lead to changes in the code, so I will return to the review process later (please request a review when you are ready).

Besides that, I have a comment about defining the database. You've used the old "metadata"-style approach to define separate entities, which is as follows:

        self.MiscData = misc.define_binding(self.database)
        self.Certificate = certificate.define_binding(self.database)

I could be mistaken, but it seems clear that this approach may not be convenient for the end-developer. We have transitioned to a different approach, as utilized in KnowledgeDB. Please see the following:

tribler/src/tribler/core/components/knowledge/db/knowledge_db.py

Line 69 in 3b760ab

self.instance.bind('sqlite', filename or ':memory:', create_db=True)

@kozlovsky am I right regarding the DB definition?

drew2a · 2023-06-29T10:12:37Z

src/tribler/core/components/popularity/community/popularity_community.py


-    def __init__(self, *args, torrent_checker=None, **kwargs):
+    community_id = unhexlify('9aca62f878969c437da9844cba29a134917e1649')


Are you certain that altering the community ID is necessary? As I understand it, this change results in a fork in the Popularity Community.

drew2a · 2023-06-29T10:36:31Z

src/tribler/core/components/popularity/community/popularity_community.py

+        self.register_task("ping_rendezvous", self.ping_rendezvous,
+                           interval=PopularityCommunity.PING_INTERVAL_RENDEZVOUS)


NIT: I believe it might be slightly better to use 'self' here, as well as in the line above (which I understand isn't your change). Using 'self' offers more flexibility for testing: it allows you to alter the value without meddling with the class namespace, which could potentially affect other tests.

Suggested change

-        self.register_task("ping_rendezvous", self.ping_rendezvous,
-                           interval=PopularityCommunity.PING_INTERVAL_RENDEZVOUS)
+        self.register_task("ping_rendezvous", self.ping_rendezvous,
+                           interval=self.PING_INTERVAL_RENDEZVOUS)

An example:

    class A:
        NUMBER = 1

        def __init__(self):
            print(A.NUMBER)

    class ChildA(A):
        NUMBER = 2

    class B:
        NUMBER = 1

        def __init__(self):
            print(self.NUMBER)

    class ChildB(B):
        NUMBER = 2

    A()       # prints 1
    ChildA()  # prints 1
    B()       # prints 1
    ChildB()  # prints 2

While the current design is functional, I believe it's not the most user-friendly as it doesn't provide an intuitive means for customizing the class's behavior through variables. I propose a slightly modified version which might be a bit more optimal:

    class PopularityCommunity(RemoteQueryCommunity, VersionCommunityMixin):
        ...
        PING_INTERVAL_RENDEZVOUS = 60  # seconds

        def __init__(self, *args, torrent_checker=None, rendezvous_db=None,
                     ping_rendezvous_interval: float = PING_INTERVAL_RENDEZVOUS, **kwargs):
            ...
            self.register_task("ping_rendezvous", self.ping_rendezvous,
                               interval=ping_rendezvous_interval)

For instance, in the tests, you could easily pass a custom value to the class as follows:

    community = PopularityCommunity(
        self._ipv8_component.peer,
        self._ipv8_component.ipv8.endpoint,
        Network(),
        ping_rendezvous_interval=0.1
    )

drew2a · 2023-06-29T10:42:23Z

src/tribler/core/components/popularity/community/popularity_community.py

@@ -35,22 +42,98 @@ class PopularityCommunity(RemoteQueryCommunity, VersionCommunityMixin):
    GOSSIP_POPULAR_TORRENT_COUNT = 10
    GOSSIP_RANDOM_TORRENT_COUNT = 10

-    community_id = unhexlify('9aca62f878969c437da9844cba29a134917e1648')
+    PING_INTERVAL_RENDEZVOUS = 60  # seconds
+    RENDEZVOUS_DB_NAME = 'rendezvous.db'


NIT: it might be better to declare this variable within the component to align with the current methodology of separating responsibilities.

Components are more specific; they can interact with particular databases, specify particular file names, etc. On the other hand, communities are more abstract, and databases and other instances should ideally be passed to them.

Consequently, we would have two instances where communities are created:

During runtime, they are created by Components.

During test execution, they are created by the TestBase provided by ipv8.

drew2a · 2023-06-29T10:50:05Z

src/tribler/core/components/popularity/community/popularity_community.py


        # Init version community message handlers
        self.init_version_community()
+        self.rendezvous_cache = RendezvousCache()
+
+    def send_introduction_request(self, peer):


Perhaps the logic for rendezvous_request could be relocated to the on_introduction_response function. This could simplify interactions by eliminating the need to extend introduction_request. As a bonus, this change might allow us to retain the previous community ID.

    def on_introduction_response(self, peer, dist, payload):
        super().on_introduction_response(peer, dist, payload)
        ...
        # perform rendezvous_request

The current approach is still compatible! Older peers will just ignore the extra bytes. I could drop this entire extra logic though. We can get it to work through only separate payloads.

drew2a · 2023-06-29T10:53:57Z

src/tribler/core/components/popularity/community/popularity_community.py

+        else:
+            # This nonce has been burned.
+            self.rendezvous_cache.clear_peer_challenge(peer)


Codacy is right :)

Suggested change

-        else:
-            # This nonce has been burned.
-            self.rendezvous_cache.clear_peer_challenge(peer)
+        # This nonce has been burned.
+        self.rendezvous_cache.clear_peer_challenge(peer)

drew2a · 2023-06-29T10:54:40Z

src/tribler/core/components/popularity/community/popularity_community.py

+            if not certificate:
+                certificate = self.rdb.Certificate(public_key=peer.mid, counter=0)
+            certificate.counter += 1
+        return


Unnecessary return :)

Suggested change

-        return

drew2a · 2023-06-29T12:08:37Z

src/tribler/core/components/popularity/popularity_component.py

@@ -22,13 +24,16 @@ async def run(self):
        metadata_store_component = await self.require_component(MetadataStoreComponent)
        torrent_checker_component = await self.require_component(TorrentCheckerComponent)

+        rendezvous_db = RendezvousDatabase(db_path=self.session.config.state_dir / STATEDIR_DB_DIR / PopularityCommunity.RENDEZVOUS_DB_NAME)


No, it's perfectly acceptable to create a database here. However, don't forget to close the database if it's necessary.

tribler/src/tribler/core/components/knowledge/knowledge_component.py

Lines 57 to 58 in 3b760ab

        if self.knowledge_db:
            self.knowledge_db.shutdown()

synctext

First of all, congratulations also from my side on one of those rare PhD code contributions to Tribler! Much appreciated 🚀 🥇 🚀

The goal is said to be: determining the online time of fellow peers. Reading through the code I got inspiration for an "algorithm 1" type of innovation we require for solid publications. The code return RendezvousCertificate.get(public_key == pk).count() calculates the count of rendezvous certificates. In future, use an "algorithm 1" type of approach to calculate the probability of this identity being a Sybil, given the volume, age, and IPv4 diversity of the "rendezvous DAG". Does this go a bit beyond MeritRank, or is it equal to MeritRank? It should be N log N complexity.

InvictusRMC · 2023-07-03T08:18:38Z

Great idea! This first version is just data collection. This scoring will go beyond MeritRank as it will require multiple dimensions. The idea is as follows: run MeritRank on each metric individually to obtain a score per metric. Next, we introduce implementation-specific weights to combine all scores into a single score.
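As a rough illustration of the last step only, the per-metric scores could be merged with a weighted average. This is a sketch under assumptions: `combine_scores` is a hypothetical helper, the metric names are borrowed from the comment above (volume, age, IPv4 diversity), the numbers are placeholders, and MeritRank itself is not implemented here.

```python
from typing import Dict


def combine_scores(metric_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-metric scores (hypothetical helper)."""
    total_weight = sum(weights[name] for name in metric_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights[name] for name, score in metric_scores.items())
    return weighted_sum / total_weight


# Placeholder per-metric scores, as if produced by one MeritRank run per metric.
scores = {"volume": 1.0, "age": 0.5, "ipv4_diversity": 0.0}
# Placeholder implementation-specific weights.
weights = {"volume": 2.0, "age": 1.0, "ipv4_diversity": 1.0}
combined = combine_scores(scores, weights)
```

A weighted average is only one possible choice; the comment does not specify how the weights are applied.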

kozlovsky · 2023-07-06T10:41:21Z

Besides that, I have a comment about defining the database. You've used the old "metadata"-style approach to define separate entities, which is as follows:
        self.MiscData = misc.define_binding(self.database)
        self.Certificate = certificate.define_binding(self.database)
I could be mistaken, but it seems clear that this approach may not be convenient for the end-developer. We have transitioned to a different approach, as utilized in KnowledgeDB. Please see the following:

tribler/src/tribler/core/components/knowledge/db/knowledge_db.py

Line 69 in 3b760ab

self.instance.bind('sqlite', filename or ':memory:', create_db=True)

@kozlovsky am I right regarding the DB definition?

I think the current approach used in this PR has some benefits; it allows PyCharm IDE to understand the type of expressions like self.Certificate.

In the KnowledgeDatabase, all entities are defined inside a single define_binding method. This way it is possible to use fewer files, as all entities are defined in a single file. The drawback is that PyCharm can't deduce types of expressions like self.instance.StatementOp.

If you want to combine the benefits of both approaches, you can define all entities in a single RendezvousDatabase method called from the __init__, return a tuple of entity classes as a result value, and assign them as fields of the RendezvousDatabase:

class RendezvousDatabase:
    def __init__(self, db_path: Union[Path, type(MEMORY_DB)]):
        self.database = Database()
        self.Certificate, self.MiscData = self.define_binding()
        ...
    def define_binding(self):
        class Certificate(self.database.Entity):
            ...
            
        class MiscData(self.database.Entity):
            ...
            
        return Certificate, MiscData

Then, all entities can be defined in a single file (if it is considered beneficial), and PyCharm understands the types of expressions like rdb.Certificate

kozlovsky · 2023-07-06T10:54:58Z

src/tribler/core/components/popularity/rendezvous/db/orm_bindings/certificate.py

+        def get_count(cls, pk: bytes) -> int:
+            return RendezvousCertificate.get(public_key == pk).count()


It looks like the method is currently not used. It can probably be deleted.

If you want to keep the method, it should be fixed, as the code is incorrect: RendezvousCertificate.get(...) call returns a single certificate object, and a single certificate object does not have the count() method.

The correct code (if it is necessary) should probably look like:

        def get_count(cls, pk: bytes) -> int:
            certificate = RendezvousCertificate.get(public_key=pk)
            return 0 if certificate is None else certificate.counter

kozlovsky · 2023-07-06T10:57:26Z

src/tribler/core/components/popularity/community/popularity_community.py

+        with db_session:
+            certificate = self.rdb.Certificate.get(public_key=peer.mid)


To avoid possible database locking, it is better to use either db_session(immediate=True) here or Certificate.get_for_update(public_key=peer.mid); the result is the same.
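To illustrate the locking concern at the SQLite level, here is a minimal sketch using the standard-library `sqlite3` module rather than Pony. It assumes that an "immediate" session corresponds to `BEGIN IMMEDIATE`, which takes the write lock before the read instead of upgrading a read lock later. The table layout and the `increment_counter` helper are hypothetical.

```python
import sqlite3


def increment_counter(conn: sqlite3.Connection, public_key: bytes) -> int:
    # Take the write lock up front so the read below cannot race with another writer.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT counter FROM certificate WHERE public_key = ?", (public_key,)
        ).fetchone()
        counter = (row[0] if row else 0) + 1
        conn.execute(
            "INSERT OR REPLACE INTO certificate (public_key, counter) VALUES (?, ?)",
            (public_key, counter),
        )
        conn.execute("COMMIT")
        return counter
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE certificate (public_key BLOB PRIMARY KEY, counter INTEGER NOT NULL)"
)
```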

synctext · 2023-08-15T08:58:03Z

@InvictusRMC Can you address the comments and requested changes? {you will find out I guess that "production code polishing" is like eating oatmeal, brushing teeth, etc. 🤣 }
Please inquire about the release scheduling with @drew2a. We're stabilising for many months for a Tribler release. Your code would be the exclusive focus of the next-release possibly.

synctext · 2023-08-16T07:54:24Z

Btw, the advice of @qstokkink is to test the impact on performance using a minimal Gumby test experiment. A single UDP message triggering a database write/commit is scary 😨 Something we had in 2013 Dispersy times.

InvictusRMC · 2023-08-16T11:27:54Z

@InvictusRMC Can you address the comments and requested changes? {you will find out I guess that "production code polishing" is like eating oatmeal, brushing teeth, etc. 🤣 }
Please inquire about the release scheduling with @drew2a. We're stabilising for many months for a Tribler release. Your code would be the exclusive focus of the next-release possibly.

Comments addressed! Thank you for the reminder.

InvictusRMC · 2023-08-16T11:29:05Z

Btw, the advice of @qstokkink is to test the impact on performance using a minimal Gumby test experiment. A single UDP message triggering a database write/commit is scary 😨 Something we had in 2013 Dispersy times.

Discussed with @qstokkink and @kozlovsky and they advised me to implement batching logic, as a large number of transactions would slow things down considerably.
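One way such batching could look, as a sketch only: increments are collected in memory and written in a single transaction once enough have accumulated. `BatchedCounterWriter`, the table layout, and the `max_pending` threshold are hypothetical, and the standard-library `sqlite3` module is used here instead of the ORM from the PR.

```python
import sqlite3
from collections import Counter


class BatchedCounterWriter:
    """Hypothetical helper: buffer increments, write them in one transaction."""

    def __init__(self, conn: sqlite3.Connection, max_pending: int = 100):
        self.conn = conn
        self.max_pending = max_pending
        self.pending = Counter()  # public key -> buffered increments
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS certificate "
            "(public_key BLOB PRIMARY KEY, counter INTEGER NOT NULL)"
        )

    def record_ping(self, public_key: bytes) -> None:
        # Called per incoming message; touches only memory until the batch is full.
        self.pending[public_key] += 1
        if sum(self.pending.values()) >= self.max_pending:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        with self.conn:  # one transaction and one commit for the whole batch
            for public_key, increment in self.pending.items():
                self.conn.execute(
                    "INSERT OR IGNORE INTO certificate (public_key, counter) VALUES (?, 0)",
                    (public_key,),
                )
                self.conn.execute(
                    "UPDATE certificate SET counter = counter + ? WHERE public_key = ?",
                    (increment, public_key),
                )
        self.pending.clear()
```

In a real community, `flush` would presumably also run on a timer and on shutdown so that buffered increments are not lost.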

qstokkink · 2023-08-16T12:41:01Z

After getting a small lecture about MeritRank, my advice is as follows.

Hook up a PeerObserver to ipv8 before starting it (line 63) here:

tribler/src/tribler/core/components/ipv8/ipv8_component.py

Lines 60 to 63 in 737a7e7

        ipv8 = IPv8(ipv8_config_builder.finalize(),
                    enable_statistics=config.ipv8.statistics and not config.gui_test_mode,
                    endpoint_override=endpoint)
        await ipv8.start()

Inside your new observer's remove_peer(self, peer: Peer) callback, store time.time() - peer.creation_time for the peer.public_key in a database. This database is probably best managed inside of the Ipv8Component itself.
Future work? Make a new community (or hook into MeritRank code) to share these entries.
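The bookkeeping suggested above might look roughly like this. It is a self-contained sketch: `Peer` is a stand-in dataclass with only the two fields mentioned in the comment, `OnlineTimeObserver` is a hypothetical name, a dictionary replaces the database, and summing durations across sessions is an assumption. The real IPv8 observer interface is not reproduced here.

```python
import time
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Peer:
    """Stand-in for an IPv8 peer; only the fields used below (assumption)."""
    public_key: bytes
    creation_time: float = field(default_factory=time.time)


class OnlineTimeObserver:
    """Hypothetical observer that records how long each peer was known."""

    def __init__(self):
        # In-memory stand-in for the database mentioned in the comment.
        self.online_seconds: Dict[bytes, float] = {}

    def remove_peer(self, peer: Peer) -> None:
        # Time between first seeing the peer and dropping it.
        duration = time.time() - peer.creation_time
        previous = self.online_seconds.get(peer.public_key, 0.0)
        # Assumption: durations of separate sessions are summed per public key.
        self.online_seconds[peer.public_key] = previous + duration
```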

synctext · 2023-10-10T08:10:34Z

This task is now taking over 4 months. @qstokkink indicated it is possible to re-factor the introduction-request and introduction-response with the ping features required for Sybil attack protection. We need an introduction-response message with both public keys, a nonce beyond 16 bits (or OK, no replay attack vulnerability?), and a signature.

Related work. This would make Tribler the first academically self-organising system with Sybil protection. See IPFS attack in a USENIX paper, DHT repair blog and DHT health reporting

Let's make the work by @InvictusRMC plus @grimadas the key feature of the upcoming 7.14 release. Preparing for MeritRank Production usage!

synctext · 2023-10-12T14:43:24Z

@qstokkink as you pointed out today: no ORM in IPv8 ❌ No database storage. Can you comment here a possible new API which would provide the signed certificates to the IPv8 community. Thus how can we request the rendezvous certificates (Pub-key-Them,Pub-key-OURS,nonce,signature-Them) from IPv8?
We are not trying to slow the network, so the default RATE_RENDEZVOUS_CERTIFICATES == 10 seconds inside IPv8. Meaning maximum 1 (new?) random certificate per 10 seconds (or some other simple rate limit mechanism).
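A simple rate limit of this kind could be sketched as follows. `CertificateRateLimiter` is a hypothetical name and not an IPv8 API; the 10-second default is taken from the comment above, and the injectable clock exists only to make the sketch easy to test.

```python
import time
from typing import Callable, Optional

RATE_RENDEZVOUS_CERTIFICATES = 10.0  # seconds, value taken from the comment above


class CertificateRateLimiter:
    """Hypothetical limiter: at most one certificate per interval."""

    def __init__(self, interval: float = RATE_RENDEZVOUS_CERTIFICATES,
                 clock: Callable[[], float] = time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_sent: Optional[float] = None

    def allow(self) -> bool:
        now = self.clock()
        if self.last_sent is not None and now - self.last_sent < self.interval:
            return False  # still inside the interval, drop or defer this certificate
        self.last_sent = now
        return True
```

A caller would check `allow()` before producing a certificate and skip the work when it returns False.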

InvictusRMC · 2023-10-16T10:07:50Z

@qstokkink rebased this into the ipv8 module of Tribler: #7630. Thank you for picking up the slack 🙏. Closing for now.

drew2a · 2023-10-20T08:05:17Z

Please be aware that #7630 is not a rebase of #7517; rather, it's an entirely distinct PR.

qstokkink · 2023-10-20T08:27:09Z

The commit e99dba8, which is part of #7630, is the rebase of #7517.

InvictusRMC added 8 commits May 31, 2023 11:35

Upgrade PyQt, Yarl, and LibTorrent dependencies

91d360a

Add initial rendezvous design

066a02b

Finalize rendezvous design

31dbcc8

Add rendezvous tests

b3e79ca

Move tests to seperate class

f1120ee

Use db name const + move tests

427f6c5

Revert "Upgrade PyQt, Yarl, and LibTorrent dependencies"

263b293

This reverts commit 91d360a.

Remove unnecessary override

eb0b667

InvictusRMC requested review from synctext, a team and drew2a and removed request for a team June 28, 2023 11:43

InvictusRMC commented Jun 28, 2023

drew2a suggested changes Jun 29, 2023

synctext approved these changes Jul 3, 2023

xoriole requested a review from qstokkink July 6, 2023 08:05

kozlovsky suggested changes Jul 6, 2023

qstokkink removed their request for review July 6, 2023 11:44

synctext mentioned this pull request Aug 15, 2023

PhD chapter?: Long-enduring Leaderless Circular Economy (MicroDAOs) #7452

Open

Address requested changes and comments

6dc9d4c

drew2a mentioned this pull request Sep 15, 2023

The big migration: from the Channels to the Knowledge Graph #7398

Closed

synctext mentioned this pull request Oct 12, 2023

phd placeholder: "Decentralized Machine Learning Systems for Information Retrieval" #7290

Open

qstokkink mentioned this pull request Oct 13, 2023

Rendezvous certificates rebased #7630

Merged

InvictusRMC closed this Oct 16, 2023
