Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Create rust GRPC client for Runner #491

Merged
merged 8 commits into from
Jan 8, 2024

Conversation

morgsmccauley
Copy link
Collaborator

This PR creates a rust based GRPC client for Runner, for use within Coordinator V2. For now, this exists as its own crate within the top-level directory. The Coordinator PR was becoming quite large so I decided to separate this out.

Additionally, I've made a couple changes to the Runner proto:

  • Rename the package: spec -> runner
  • Remove executorId from StartExecutorRequest - it should be deterministic and assigned internally
  • Add version to the executor - this will be used to determine whether an executor should be restarted

@morgsmccauley morgsmccauley requested a review from a team as a code owner January 8, 2024 02:00
@morgsmccauley morgsmccauley linked an issue Jan 8, 2024 that may be closed by this pull request
Copy link
Collaborator

@darunrs darunrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like you went with the approach of having client be a separate folder and keeping a copy of the runner proto file. I wasn't sure about that approach since we'd have duplicate proto files and I felt like the build process would be a bit wonky since (New) Coordinator would then have to copy the folder in, and then build it separately, and then I assume do something with the build output? I wasn't really sure how that would end up looking. What's the integration in Docker like? Have you tried using it in new coordinator, existing coordinator, or block streamer as a test?

runner/protos/runner.proto Outdated Show resolved Hide resolved
runner/protos/runner.proto Outdated Show resolved Hide resolved
if (validationResult !== null) {
callback(validationResult, null);
return;
}

const { accountId, functionName, code, schema, redisStream, version } = call.request;
const executorId = hashString(`${accountId}/${functionName}`);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're using accountId and function name anyway, is there a need for hashing here? Couldn't we just use accountId/functionName directly? I had envisioned executorId would be something coordinator decides (And it includes whatever information it wants to generate it, could be function name, account id, maybe machine ID, version, etc. in the future, whatever it is) and it simply passes it over to Runner so that it can be stored for access in that way. Coordinator can then pass that, in the future to one of many runners, or maybe a load balancer, depending on how we set things up. Did you have any ideas on leveraging determinism of the ID for anything or is it more just because its a good quality to have.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically the service (Runner) which manages the entities (Executors) assigns the IDs, as they have the most intimate knowledge of these entities and can ensure there are no collisions. I'm not sure there is benefit in Coordinator generating these IDs.

As for hashing, it's generally a good idea to keep IDs opaque. This allows us to change how these IDs are generated going forward without impacting clients, which I have been somewhat thinking about, we may not need deterministic IDs after all. Without opaque IDs, clients might construct the IDs themselves, and then run in to issues when the IDs change on the server side.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Yeah you're right that Coordinator managing the IDs is not particularly valuable since it will be given the ID back when it creates the executor, which it can use for stopping an executor later. This is actually better as Runner being the ID creator means Coordinator is less "stateful". Like, if Coordinator dies and we restart it and have it load the existing runner streams and their IDs, then having Coordinator assume those IDs makes more sense if Runner created them, rather than some previous instance of Coordinator.

And yeah your points on changing ID generation was what I was considering. Changing the way we ID the executors is probably not too difficult either, if we continue with the approach of Runner creating it and passing it back.

runner/src/server/runner-service.ts Outdated Show resolved Hide resolved
@morgsmccauley
Copy link
Collaborator Author

morgsmccauley commented Jan 8, 2024

So it looks like you went with the approach of having client be a separate folder and keeping a copy of the runner proto file. I wasn't sure about that approach since we'd have duplicate proto files and I felt like the build process would be a bit wonky since (New) Coordinator would then have to copy the folder in, and then build it separately, and then I assume do something with the build output? I wasn't really sure how that would end up looking. What's the integration in Docker like? Have you tried using it in new coordinator, existing coordinator, or block streamer as a test?

This is being used in #444 now. The Docker context needs to be at the root directory anyway since Coordinator depends on two separate crates.

Duplicate proto files might be something we just have to live with until we migrate Runner to Rust 😅.

Let me try symlinking them, that might be more convenient.

Copy link
Collaborator

@darunrs darunrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I understand what you mean by having the client at root. I'll learn whatever strategy you use for linking so I know it for the future. And yeah, relying on proto files for codegen is indeed one of the cons of using protobuf. We might have to do it again if we eventually allow developers to stop and start their indexers from the UI, but that's likely a long time from now, if we ever do so.

One last thing, I noticed you are using the hard coded host:port values. That's also fine for now, though in my PR I did address that by setting that up. Its up to you if you want to include that in this PR. I feel like it's a good place to do so. I or you can of course do so in a future PR too.

@morgsmccauley
Copy link
Collaborator Author

One last thing, I noticed you are using the hard coded host:port values. That's also fine for now, though in my PR I did address that by setting that up. Its up to you if you want to include that in this PR. I feel like it's a good place to do so. I or you can of course do so in a future PR too.

Let's get this merged and we can address in a future PR :)

@morgsmccauley morgsmccauley merged commit 7655aef into main Jan 8, 2024
7 checks passed
@morgsmccauley morgsmccauley deleted the 468-create-rust-grpc-client-for-runner branch January 8, 2024 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create Rust GRPC client for Runner
2 participants