feat: Add gRPC Support #34
Great new feature addition here!
Should we add some new CI checks related to verifying the changes in .proto
files are safe/sane with something like buf?
Yeah we should have a CI check, I'll add that.
Description
proto/
build/scripts/proto/compile.sh
internal/service
internal/sink
internal/file
to support loading a server certificate for the sink from one of many locations
Motivation and Context
This update is the first step in the project becoming a true "distributed system" (in the architectural sense) and enables users to deploy the system in new ways (described as future work below). Because the impact of this change is large, we should address any open questions before approving the PR.
In the short term, this gives us the ability to support synchronous (sync) invocation in AWS Lambda. The problem we've had until now is that there was no way for the sink goroutine to send results back to the calling application -- this is a requirement in supporting sync invocations because the processed data must be returned by the Lambda handler. With this change we can send data from the sink to the caller by using a gRPC service for inter-process communication (IPC). This is shown in the diagram below where goroutines are represented by dotted lines and data flow is represented by solid arrows. (Using gRPC for IPC is described in more detail in the new example included in this PR.)
Not supporting sync invocation was a blocker to supporting some AWS services (like data transformation in Kinesis Firehose) and to making the system behave like a data enrichment microservice. It's not yet clear how we'll implement this, but these changes make it possible to deploy Substation as an "enrichment node" within the context of a larger system that can be invoked using the existing Lambda processor.
Onto future work: the added benefit of gRPC is that, with little effort on our part, we can extend that functionality to systems beyond serverless AWS services. This PR adds a definition for a Sink service that mimics the internal Sink interface, but we could also add definitions that mimic other components of the system. For example, these definitions would turn every processor and inspector into configurable microservices:
Overall, defining Protobuf based on the system's structs and interfaces would be a relatively safe method for making components accessible to external services (including services not written in Go) and letting others build their own distributed data pipelines on non-serverless infrastructure. (I don't anticipate the team at Brex doing this any time soon since it would increase complexity and we're happy with AWS Lambda, but if others are interested, it would be easy to support.)
How Has This Been Tested?
The new example included in this PR acts as an integration test for all of the new features, including the proto, the internal gRPC service, the internal gRPC server, and the gRPC sink.
Types of changes
Checklist: