This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

Split up server and worker into different crates #43

Open
JeanMertz opened this issue Aug 20, 2019 · 0 comments

Comments

@JeanMertz
Contributor

Compilation times have become significantly worse because the worker loads all processors, which by their very nature each depend on different crates, ballooning the total number of dependencies to 400+ crates.

The server doesn't do anything with these processors, other than needing to know their type definitions so that it knows how to expose the processor configurations via GraphQL.

I think it makes sense to start splitting all of this up, but I haven't yet come to a good design.

The worker and server depend on the processors for different reasons:

  • server depends on all processors to know their GraphQL input types
  • worker depends on all processors to run them

Aside from that, both server and worker need typed information about the database: the server uses it to push new jobs and to fetch job status based on GraphQL requests, while the worker fetches pending jobs and pushes job updates to the database.

So if we were to split all that up into separate crates, so that they don't have to be re-compiled all the time, you'd end up with something like:

  • server – GraphQL API
  • storage – types related to storing and fetching data
  • worker – run jobs
  • processor-types – a set of processor type signatures
  • processor-shell-command/.../... – all the different processor implementations

In such a situation, the server would depend on storage and processor-types, but not on the actual processor implementations, and also not on the worker.

The worker would depend on everything except the server itself.
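To make the dependency split concrete, here is a rough single-file sketch in which each module stands in for one of the proposed crates. Everything in it is an assumption for illustration: the `Signature` struct, the `SHELL_COMMAND` constant, the field names, and the function names do not come from the existing code base.

```rust
/// Stand-in for the proposed `processor-types` crate: signatures only,
/// so it can stay free of the heavy per-processor dependencies.
mod processor_types {
    /// Hypothetical description of a processor and its input fields.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Signature {
        pub name: &'static str,
        pub input_fields: &'static [&'static str],
    }

    /// Hypothetical signature for the shell-command processor.
    pub const SHELL_COMMAND: Signature = Signature {
        name: "shell-command",
        input_fields: &["command", "arguments"],
    };

    /// Every processor has to be listed here by hand.
    pub fn all() -> Vec<Signature> {
        vec![SHELL_COMMAND]
    }
}

/// Stand-in for `processor-shell-command`: the actual implementation,
/// which is where the heavy dependencies would live.
mod processor_shell_command {
    /// Placeholder implementation that only echoes its input.
    pub fn run(input: &str) -> Result<String, String> {
        if input.is_empty() {
            Err("empty input".to_string())
        } else {
            Ok(format!("ran: {input}"))
        }
    }
}

/// Stand-in for `server`: only needs the signatures.
mod server {
    use super::processor_types;

    /// Names of the processors the GraphQL API would expose.
    pub fn exposed_processors() -> Vec<&'static str> {
        processor_types::all().iter().map(|s| s.name).collect()
    }
}

/// Stand-in for `worker`: needs the signatures and the implementations.
mod worker {
    use super::{processor_shell_command, processor_types};

    /// Dispatch a job to the matching processor implementation.
    pub fn run_job(processor: &str, input: &str) -> Result<String, String> {
        if processor == processor_types::SHELL_COMMAND.name {
            processor_shell_command::run(input)
        } else {
            Err(format!("unknown processor: {processor}"))
        }
    }
}
```

In this layout `server` never references `processor_shell_command`, so changes to a processor implementation would not force the server to recompile. The hand-maintained list in `processor_types::all()` is the cost of that split.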

This is still not great: creating a new processor now involves not just writing one, but also adding it to the list of processor types in another crate. That makes it harder to create a processor and add it to your set-up (which isn't possible right now either, since processors are compiled into the server/worker binaries).

Another thought I had was to actually change processors to become binaries, and have the server and worker communicate over RPC. This would solve most of these issues (and would allow processors to not be built in Rust), but it would add an extra layer of complexity to the cross-binary communication, and it would also reduce type safety.

One way to do that would be to have something like this:

  • when starting the server, you pass in a set of strings, representing the processor binaries you want to "enable"
  • on start-up, the server runs each of these binaries with some kind of signature argument to ask the processor for its type signature
  • the server then uses this type signature to configure GraphQL (however, the current GraphQL library we use doesn't support dynamic schemas), and to serialise the data before storing it in the database
  • when the worker needs a processor to do some work, it runs the processor's binary with the correct data passed in
  • this would already work reasonably easily, since we only pass JSON-serialised data to the processors as input and get a string (or error) back as output, which translates well to passing a JSON-formatted string to the binary and getting back an exit code + string output
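The binary-based flow above could look roughly like the sketch below, using only `std::process::Command`. The `--signature` and `--run` flags, the `ProcessorError` type, and all function names are hypothetical; the issue only says "some kind of signature argument" and does not fix how the JSON is passed (here it is passed as a command-line argument).

```rust
use std::process::Command;

/// Hypothetical error type: either the binary could not be started,
/// or it exited with a non-zero status.
#[derive(Debug, PartialEq)]
pub enum ProcessorError {
    Spawn(String),
    Failed { code: Option<i32>, stderr: String },
}

/// Run a processor binary and map "exit code + string output" to a `Result`.
pub fn call_processor(binary: &str, args: &[&str]) -> Result<String, ProcessorError> {
    let output = Command::new(binary)
        .args(args)
        .output()
        .map_err(|e| ProcessorError::Spawn(e.to_string()))?;

    if output.status.success() {
        // Exit code 0: stdout is the processor's string output.
        Ok(String::from_utf8_lossy(&output.stdout).trim_end().to_string())
    } else {
        // Non-zero exit code: report the code and whatever was written to stderr.
        Err(ProcessorError::Failed {
            code: output.status.code(),
            stderr: String::from_utf8_lossy(&output.stderr).trim_end().to_string(),
        })
    }
}

/// Server side: ask a processor for its type signature.
/// `--signature` is an assumed flag, not an existing interface.
pub fn query_signature(binary: &str) -> Result<String, ProcessorError> {
    call_processor(binary, &["--signature"])
}

/// Worker side: hand a JSON-serialised job to a processor.
/// `--run` is an assumed flag, not an existing interface.
pub fn run_job(binary: &str, input_json: &str) -> Result<String, ProcessorError> {
    call_processor(binary, &["--run", input_json])
}
```

The server would call `query_signature` once per enabled binary at start-up, and the worker would call `run_job` per job. What the returned signature string contains, and how it would feed a dynamic GraphQL schema, is left open here, as it is in the issue.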

Still, it's quite some work, and there are still some gaps (such as dynamic schemas).
