
Implement deterministic subnets peer discovery #3648

Open
divagant-martian opened this issue Oct 20, 2022 · 6 comments

@divagant-martian
Collaborator

Description

After #2847 we still need to implement the changes required to search for peers that should be subscribed to their deterministic long-lived subnets. This search takes the form of prefix searching, in contrast to our current predicate search. After this is implemented, and as a separate issue, we also need to think about how to deal with peers that should be subscribed to a long-lived subnet but aren't.

Version

unstable

Present Behaviour

We can switch to using deterministic long-lived subnets, but this only changes the subscription process. The search process is still predicate-based.

Expected Behaviour

Add (as in, include without removing the current code, for backwards compatibility) the logic to search for peers based on a prefix (see ethereum/consensus-specs#2749).

Steps to resolve

Still need to think about this more.

@AgeManning
Member

The issue here gives a decent overview: ethereum/consensus-specs#2749

Discovery is based on a Kademlia-like routing table.

Currently, when we do a discovery search, we look for a random target, which means we randomly search the DHT for new nodes and collect useful nodes as we traverse the DHT.

See here in lighthouse code: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#L348

After the spec change, peers on specific subnets can be found based on their node-ids, i.e. the prefix of a node-id determines which subnet the node should be subscribed to. This means that instead of doing random searches when looking for peers for a given subnet, we can target specific node-ids.
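As a rough illustration (constant values assumed from the consensus-specs proposal; the helper name is ours, not the spec's), the subnet-relevant prefix is just the top bits of the node-id:

```python
NODE_ID_BITS = 256  # discv5 node-ids are 256 bits
ATTESTATION_SUBNET_PREFIX_BITS = 6  # assumed: 64 subnets -> 6 prefix bits

def node_id_prefix(node_id: int) -> int:
    # The top ATTESTATION_SUBNET_PREFIX_BITS bits of a node-id determine
    # which subnets the node should be subscribed to (per the spec change).
    return node_id >> (NODE_ID_BITS - ATTESTATION_SUBNET_PREFIX_BITS)
```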

When we try to discover peers on a given subnet, we call this function: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#L361

When we go to actually run a subnet query (i.e. a search for peers for a given set of subnets), we run this function: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#L697

We ultimately call this function: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#L756

This sets a random node-id as the target for the query. It collects all the nodes it finds that match the subnets we are looking for and reports them back. See here: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#L777

Instead of using a random node-id, we can be smarter and use a random node-id that is prefixed with the bits required of nodes on the subnet we are looking for.
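A minimal sketch of that target construction, assuming the same constants as above (the helper name is hypothetical):

```python
import secrets

NODE_ID_BITS = 256
ATTESTATION_SUBNET_PREFIX_BITS = 6  # assumed from the consensus-specs proposal

def random_node_id_with_prefix(prefix: int) -> int:
    # Fix the top bits to the required prefix and randomise the rest, so the
    # discovery walk heads into the region of the DHT where nodes on the
    # corresponding subnet live.
    shift = NODE_ID_BITS - ATTESTATION_SUBNET_PREFIX_BITS
    return (prefix << shift) | secrets.randbits(shift)
```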

There are a few tricky parts to this.

The first is that, given a node-id, we can compute the subnet-id that the node is supposed to be subscribed to at a given epoch using this function: https://github.com/sigp/lighthouse/blob/stable/consensus/types/src/subnet_id.rs#L78

We would like the inverse: given an epoch and a subnet, what is the node-prefix required? Failing that, we can construct a mapping of prefixes to the 64 subnets and use that (the mapping changes per epoch).

This change therefore involves discovery having some notion of epochs.

Secondly, we currently group subnet discoveries. A grouped discovery is not going to work if we are shifting the target node-id per subnet. There have been some simulations suggesting that parallel discoveries are just as efficient, so I think we need to remove the concept of grouped discoveries, have only single-subnet discoveries, and increase the number of parallel searches.

We can increase the number of parallel queries via: https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/discovery/mod.rs#LL62C48-L62C48 and https://github.com/sigp/lighthouse/blob/stable/beacon_node/lighthouse_network/src/config.rs#L289

@ackintosh
Member

I'll take this on after reading the issue.

@divagant-martian
Collaborator Author

Hey @ackintosh, awesome that you are taking this. Let me know if I can help you in any way.

@ackintosh
Member

> We would like the inverse: given an epoch and a subnet, what is the node-prefix required?

I've been thinking about it for a while, but I couldn't work out how to calculate the node-prefix. 🤔

https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#attestation-subnet-subscription

```python
def compute_subscribed_subnet(node_id: NodeID, epoch: Epoch, index: int) -> SubnetID:
    node_id_prefix = node_id >> (NODE_ID_BITS - ATTESTATION_SUBNET_PREFIX_BITS)
    node_offset = node_id % EPOCHS_PER_SUBNET_SUBSCRIPTION
    permutation_seed = hash(uint_to_bytes(uint64((epoch + node_offset) // EPOCHS_PER_SUBNET_SUBSCRIPTION)))
    permutated_prefix = compute_shuffled_index(
        node_id_prefix,
        1 << ATTESTATION_SUBNET_PREFIX_BITS,
        permutation_seed,
    )
    return SubnetID((permutated_prefix + index) % ATTESTATION_SUBNET_COUNT)
```

permutation_seed is a hash calculated from values including node_offset, and node_offset is derived from the full node_id. Therefore, the subnet that a given node-prefix maps to depends on the node_id behind it, and we cannot determine a node-prefix given only an epoch and a subnet.
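To make that concrete, here is a small sketch (constants assumed from the spec; the node-ids and epoch are arbitrary) showing that two node-ids with the same prefix can land in different permutation_seed windows at the same epoch:

```python
EPOCHS_PER_SUBNET_SUBSCRIPTION = 256  # assumed spec value (2**8)
NODE_ID_BITS = 256
ATTESTATION_SUBNET_PREFIX_BITS = 6

shift = NODE_ID_BITS - ATTESTATION_SUBNET_PREFIX_BITS
# Two node-ids sharing the top-6-bit prefix 0b000101, different low bits.
node_a = (0b000101 << shift) | 1
node_b = (0b000101 << shift) | 200

epoch = 250
# node_offset uses the FULL node_id, so the seed window can differ:
window_a = (epoch + node_a % EPOCHS_PER_SUBNET_SUBSCRIPTION) // EPOCHS_PER_SUBNET_SUBSCRIPTION
window_b = (epoch + node_b % EPOCHS_PER_SUBNET_SUBSCRIPTION) // EPOCHS_PER_SUBNET_SUBSCRIPTION
assert window_a != window_b  # different seeds -> possibly different subnets
```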

@divagant-martian @AgeManning Please correct me if I have misunderstood something.

@AgeManning
Member

AgeManning commented Nov 6, 2023

No, I think you're right. What we need is for the permutation seed to be based on the node_id_prefix, right? We would want:

```python
node_offset = node_id_prefix % EPOCHS_PER_SUBNET_SUBSCRIPTION
```

That should resolve the issue. We could then pre-calculate the 64 prefixes required.

@ackintosh
Member

Yes, I think so! Then we could calculate a mapping as follows:

```python
# Calculates a mapping of subnet_id to node_id_prefix.
def compute_node_id_prefix_map(epoch: Epoch, index: int):
    prefix_map = {}
    for node_id_prefix in range(2**ATTESTATION_SUBNET_PREFIX_BITS):
        # With the proposed change, node_offset depends only on the prefix.
        node_offset = node_id_prefix % EPOCHS_PER_SUBNET_SUBSCRIPTION
        permutation_seed = hash(uint_to_bytes(uint64((epoch + node_offset) // EPOCHS_PER_SUBNET_SUBSCRIPTION)))
        permutated_prefix = compute_shuffled_index(
            node_id_prefix,
            1 << ATTESTATION_SUBNET_PREFIX_BITS,
            permutation_seed,
        )
        # Apply index, mirroring compute_subscribed_subnet above.
        subnet_id = SubnetID((permutated_prefix + index) % ATTESTATION_SUBNET_COUNT)
        prefix_map[subnet_id] = node_id_prefix
    return prefix_map
```

I did a quick test with the Python code, and it works fine.
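For what it's worth, a sketch of how the map might then drive a targeted search, assuming compute_node_id_prefix_map above and the spec constants are in scope (the helper name is hypothetical):

```python
import secrets

def discovery_target_for_subnet(subnet_id, epoch, index=0):
    # Look up which prefix nodes on this subnet must have at this epoch...
    prefix = compute_node_id_prefix_map(epoch, index)[subnet_id]
    # ...then randomise the remaining bits to get a discv5 query target.
    shift = NODE_ID_BITS - ATTESTATION_SUBNET_PREFIX_BITS
    return (prefix << shift) | secrets.randbits(shift)
```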
