Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SDK Sampling interface #1877

Merged
merged 26 commits into from
Aug 23, 2023
Merged

Conversation

joshua-kim
Copy link
Contributor

@joshua-kim joshua-kim commented Aug 17, 2023

Why this should be merged

Allow the SDK clients to configure their node sampling strategy for AppRequestAny.

How this works

Moves the peer state in Router (which was questionable anyways) to an implementation of an interface. We now expose two different implementations, one to sample any peer and one to sample strictly validators. The VM is expected to keep a reference to a set of samplers, and re-use them for any clients it might create.

How this was tested

Added some unit tests.

network/p2p/client.go Outdated Show resolved Hide resolved
@joshua-kim joshua-kim self-assigned this Aug 18, 2023
@joshua-kim joshua-kim added the sdk This involves SDK tooling or frameworks label Aug 18, 2023
type NodeSampler interface {
// Sample returns at most [n] nodes. This may return fewer nodes if fewer
// than [n] are available.
Sample(n int) ([]ids.NodeID, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Sample take a context? You are already needing to stub out TODO getRecentValidatorIDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of a sample that would return an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Sample take a context? You are already needing to stub out TODO getRecentValidatorIDs.

Yeah this makes sense, I can add this to the signature

Can you give an example of a sample that would return an error?

Hmm I guess an example would be some grpc failure? Some implementation might hit the p-chain for example

Copy link
Contributor Author

@joshua-kim joshua-kim Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the only reason we have this Sample interface is to use it for the Client node selection implementation. We could actually swallow this error and just return an empty slice of node ids, and have the Client report an error if it sees no nodes to send to, so the error is still visible and this interface signature is much cleaner since we can get rid of the error here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could actually swallow this error and just return an empty slice of node ids, and have the Client report an error if it sees no nodes to send to,

If we error and return on calls to v.validators.GetCurrentHeight(ctx) or v.validators.GetValidatorSet(ctx, height, v.subnetID) we won't update v.validatorIDs so future sampling won't return an empty slice of node IDs, it'll just return stale node IDs. FWIW the sampler implementations in the sampler package have errors in their return signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a unit test for this specific scenario to make sure we return an empty slice instead of stale ids so the error is surfaced correctly in the Client.

network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/router.go Show resolved Hide resolved
network/p2p/router_test.go Outdated Show resolved Hide resolved
network/p2p/router.go Show resolved Hide resolved
}

for disconnected := range tt.disconnected {
require.NoError(sampler.Disconnected(context.Background(), disconnected))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use context.WithCancel for the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum can you elaborate? I didn't understand why this would be helpful here.

_ NodeSampler = (*Peers)(nil)
)

type NodeSampler interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this NodeSampler or PeerSampler

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more like a node sampler to me. It samples nodes (which may or may not be connected)

Copy link
Contributor Author

@joshua-kim joshua-kim Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my thinking was that the terminology we use are:

  • node: a member in the network
  • peer: a node we are connected to
  • validator: a staking node

so Peers is a component that represents nodes are are connected to, and Validators is a component that represents staking nodes.

network/p2p/node_sampler.go Outdated Show resolved Hide resolved
network/p2p/node_sampler.go Outdated Show resolved Hide resolved
network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/node_sampler_test.go Outdated Show resolved Hide resolved
network/p2p/node_sampler.go Outdated Show resolved Hide resolved
joshua-kim and others added 3 commits August 21, 2023 17:20
Co-authored-by: Stephen Buttolph <stephen@avalabs.org>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Co-authored-by: Stephen Buttolph <stephen@avalabs.org>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Co-authored-by: Stephen Buttolph <stephen@avalabs.org>
Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
type NodeSampler interface {
// Sample returns at most [limit] nodes. This may return fewer nodes if
// fewer than [limit] are available.
Sample(ctx context.Context, limit int) []ids.NodeID
Copy link
Contributor Author

@joshua-kim joshua-kim Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I also considered making another type to return here like:

type Node struct {
    ID ids.NodeID
}

so we could add onto it without causing breaking changes. I felt like it was reasonable that in the future we might add some peer-level metadata to this, but it seemed premature to add for now.

@joshua-kim joshua-kim requested review from hexfusion and StephenButtolph and removed request for coffeeavax August 22, 2023 13:08
network/p2p/validators.go Outdated Show resolved Hide resolved
network/p2p/validators.go Show resolved Hide resolved
network/p2p/validators.go Outdated Show resolved Hide resolved
Comment on lines 45 to 51
if err != nil {
return
}
validatorSet, err := v.validators.GetValidatorSet(ctx, height, v.subnetID)
if err != nil {
return
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm is there something we could do other than swallow these errors? Adding error return values seems kind of annoying but also I dislike swallowing errors. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it does seem strange. Here's a link to the rationale behind this: #1877 (comment)

@@ -41,6 +41,7 @@ type Client struct {
handlerPrefix []byte

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not part of this PR but should we have a NewClient function? If this in intended to be used only inside this package, Client can be unexported. If it's intended to be used outside this package, then a user can't use Client because they can't set its inner fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is created via RegisterAppProtocol. It's not possible to use a Client without a corresponding Router because the Router holds the map of pending requests -> the callbacks that need to be fired on response. We could theoretically refactor this into a NewClient(Router, handlerID, ....) if we felt that looked cleaner.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just feels weird to export this if it's actually unusable outside this package

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client is usable outside of this package, RegisterAppProtocol initializes it here.

Maybe we could rename this to NewClient to be more verbose or remove the receiver on Router. This definitely might seem confusing coming from the existing paradigm of how we've developed VMs without the SDK component.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Yeah I think it might be a bit confusing to someone looking at this package how to create a new client. I think maybe even just a comment on Client might suffice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I could see an argument to make the constructor code NewClient(sender common.AppSender, router *Router, handlerID byte) *Client. This would have the added benefit of not having to keep around the sender as a field in Router which we currently do.

network/p2p/client.go Outdated Show resolved Hide resolved
Signed-off-by: Stephen Buttolph <stephen@avalabs.org>
@StephenButtolph StephenButtolph added this to the v1.10.9 milestone Aug 23, 2023
@StephenButtolph StephenButtolph merged commit a3c4649 into ava-labs:dev Aug 23, 2023
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sdk This involves SDK tooling or frameworks
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants