
Reduce bandwidth over the VC<>BN API using dependent roots #4157

Closed
paulhauner opened this issue Apr 3, 2023 · 4 comments
Labels: optimization (Something to make Lighthouse run more efficiently.)

Comments

@paulhauner (Member)

Description

Presently the VC polls for attestation duties each slot. Our polling achieves:

  • On the first poll after the VC starts, we always learn the current- and next-epoch duties.
  • On the first poll of an epoch, we always learn the next-epoch duties.
  • Occasionally, during any poll, we might learn of a re-org that changes either the next- or current-epoch duties. Such re-orgs are very uncommon on mainnet.

Looking at the /eth/v1/validator/duties/attester/{epoch} endpoint, we see that its structure is:

Request Body: An array of the validator indices for which to obtain the duties.

[
  "1"
]

Note: we also include the epoch in the URL.

Response Body

{
  "dependent_root": "0xcf8e0d4e9587369b2301d0790347320302cc0943d5a1884560367e8208d920f2",
  "execution_optimistic": false,
  "data": [
    {
      "pubkey": "0x93247f2209abcacf57b75a51dafae777f9dd38bc7053d1af526f220a7489a6d3a2753e5f3e8b1cfe39b56f43611df74a",
      "validator_index": "1",
      "committee_index": "1",
      "committee_length": "1",
      "committees_at_slot": "1",
      "validator_committee_index": "1",
      "slot": "1"
    }
  ]
}
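
As a rough illustration, this response could be deserialized with types along the following lines. This is a sketch assuming serde; these are hypothetical types mirroring the JSON above, not Lighthouse's actual ones:

use serde::Deserialize;

// Hypothetical types mirroring the JSON above, not Lighthouse's real types.
#[derive(Deserialize)]
struct AttesterDutiesResponse {
    dependent_root: String,
    execution_optimistic: bool,
    data: Vec<AttesterDutyData>,
}

#[derive(Deserialize)]
struct AttesterDutyData {
    pubkey: String,
    validator_index: String,
    committee_index: String,
    committee_length: String,
    committees_at_slot: String,
    validator_committee_index: String,
    slot: String,
}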

In our current implementation, the request will contain all validators managed by the VC. I claim that, in the best case, we could send a request for only one validator and still know whether or not all the other validators need updating.

I claim this because a shuffling is uniquely identified by (epoch, dependent_root) (see "Background Info" below if this isn't clear to you). We already use this assumption in the VC:

local_pubkeys.contains(&duty.pubkey) && {
    // Only update the duties if either is true:
    //
    // - There were no known duties for this epoch.
    // - The dependent root has changed, signalling a re-org.
    attesters.get(&duty.pubkey).map_or(true, |duties| {
        duties
            .get(&epoch)
            .map_or(true, |(prior, _)| *prior != dependent_root)
    })
}

With the above statement we're filtering out any duties that already have the same (epoch, dependent_root) identifier. So, our current flow goes like this:

  • Download duties/attester for all validators.
  • Filter out any duties with a known (epoch, dependent_root).
  • Update the duties_service.attesters with any new duties (reference).

I propose that we should instead (see the sketch after this list):

  • Download duties/attester for INITIAL_DUTIES_QUERY_SIZE = 1 validators.
  • Find any validators which have conflicting (epoch, dependent_root) values.
  • If there are any conflicting validators, send a second duties/attester request for those validators.
  • Update the duties_service.attesters with any new duties which resulted from the first or second request.
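
A minimal sketch of that flow, assuming a simple per-validator cache of (epoch, dependent_root). All names here are illustrative placeholders, not Lighthouse's actual API:

use std::collections::HashMap;

type Epoch = u64;
type Hash256 = [u8; 32];
type ValidatorIndex = u64;

const INITIAL_DUTIES_QUERY_SIZE: usize = 1;

// Per-validator map of epoch -> dependent_root for already-known duties.
type KnownDuties = HashMap<ValidatorIndex, HashMap<Epoch, Hash256>>;

// Stand-in for the HTTP request to duties/attester; returns the dependent
// root reported by the BN.
fn fetch_duties(_epoch: Epoch, _validators: &[ValidatorIndex]) -> Hash256 {
    unimplemented!()
}

fn poll(epoch: Epoch, all_validators: &[ValidatorIndex], known: &KnownDuties) {
    // 1. Query duties for a small probe set of validators.
    let n = INITIAL_DUTIES_QUERY_SIZE.min(all_validators.len());
    let dependent_root = fetch_duties(epoch, &all_validators[..n]);

    // 2. Find validators whose cached (epoch, dependent_root) is missing or
    //    disagrees with the root the BN just reported.
    let conflicting: Vec<ValidatorIndex> = all_validators
        .iter()
        .copied()
        .filter(|v| {
            known
                .get(v)
                .and_then(|m| m.get(&epoch))
                .map_or(true, |prior| *prior != dependent_root)
        })
        .collect();

    // 3. Only if something changed, re-download duties for those validators
    //    and update duties_service.attesters with the results.
    if !conflicting.is_empty() {
        let _ = fetch_duties(epoch, &conflicting);
    }
}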

In the case where we don't expect the duties to change (i.e., it's not the first request after VC boot, it's not the first request of an epoch and there wasn't a re-org), we should reduce the bandwidth by a factor of the number of validators in that VC (e.g., if the VC has 100 validators then the request/response should be ~100x smaller).

Additional Details

There's some extra detail regarding the first request of INITIAL_DUTIES_QUERY_SIZE. I propose that we actually make this query of size max(INITIAL_DUTIES_QUERY_SIZE, num_uninitialized_validators), where num_uninitialized_validators is the count of validators for which we don't already know their duties for that epoch. This will be all validators when booting for the first time or when querying for the "next epoch". It avoids making the second request when we already know we need the duties for all validators.
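
In code, the first query's size might be computed roughly like this (continuing the hypothetical names from the sketch above):

// Validators with no known duties for this epoch yet.
let num_uninitialized_validators = all_validators
    .iter()
    .copied()
    .filter(|v| known.get(v).map_or(true, |m| !m.contains_key(&epoch)))
    .count();

// Make the first query big enough to cover a known-cold cache, so we
// avoid a guaranteed second round-trip on boot or at a new epoch.
let query_size = std::cmp::max(INITIAL_DUTIES_QUERY_SIZE, num_uninitialized_validators);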

Background Info

Whilst the term "dependent root" doesn't appear in the specification, it exists as a concept in get_beacon_committee. We use dependent_root to refer to the block root at the same slot that get_seed uses to load the randao_mix which is used as the input to compute_committee.

The argument is that any chain which has dependent_root in its history will always return the same result for get_beacon_committee given the same epoch.
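
This is also why (epoch, dependent_root) makes a safe cache key. A sketch of the idea, with illustrative types rather than Lighthouse's actual cache:

use std::collections::HashMap;

type Epoch = u64;
type Hash256 = [u8; 32];

// Committees for an epoch: committee index -> validator indices (illustrative).
type Committees = Vec<Vec<u64>>;

struct ShufflingCache {
    cache: HashMap<(Epoch, Hash256), Committees>,
}

impl ShufflingCache {
    // Any two chains that share `dependent_root` in their history compute
    // identical committees for `epoch`, so a hit here is valid regardless
    // of which fork the lookup came from.
    fn get(&self, epoch: Epoch, dependent_root: Hash256) -> Option<&Committees> {
        self.cache.get(&(epoch, dependent_root))
    }
}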

Here's a scrappy diagram that might help:

[Diagram: IMG_1651]

The term "dependent root" was introduced to the API some time after we'd implemented the concept in Lighthouse for keying our internal shuffling caches. Internally, we will sometimes refer to it as the "shuffling decision" root (example).

paulhauner added the optimization label on Apr 3, 2023
@michaelsproul (Member)

Something that Teku does which we could also consider is subscribing to the SSE event stream and using that to infer a change of dependent root. I guess the head event is sufficient, as it includes the previous and current dependent roots: https://ethereum.github.io/beacon-APIs/?urls.primaryName=dev#/Events/eventstream
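
For reference, the head event carries both dependent roots. Per the linked spec, its payload looks something like this (values illustrative):

{
  "slot": "10",
  "block": "0x9a2fefd2fdb57f74993c7780ea5b9030d2897b615b89f808011ca5aebed54eaf",
  "state": "0x600e852a08c1200654ddf11025f1ceacb3c2e74bdd5c630cde0838b2591b69f9",
  "epoch_transition": false,
  "previous_duty_dependent_root": "0x5e0043f107cb57913498fbf2f99ff55e730bf1e151f02f221e977c91a90a0e91",
  "current_duty_dependent_root": "0x5e0043f107cb57913498fbf2f99ff55e730bf1e151f02f221e977c91a90a0e91",
  "execution_optimistic": false
}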

This would be more of a major architectural change though, and may come with other complications (handling stream reconnects, etc.). Perhaps the reduced polling approach is more pragmatic.

jimmygchen self-assigned this on Apr 5, 2023
@jimmygchen (Member)

@paulhauner thanks a lot for the nice write-up! I'd like to look into this.

@jimmygchen (Member)

PR created here #4170

bors bot pushed a commit that referenced this issue May 15, 2023
## Issue Addressed

#4157 

## Proposed Changes

See description in #4157.

In diagram form:

![reduce-attestation-bandwidth](https://user-images.githubusercontent.com/742762/230277084-f97301c1-0c5d-4fb3-92f9-91f99e4dc7d4.png)


Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>
@jimmygchen (Member)

Completed in #4170
