feat: Add gRPC Support #34

Merged
merged 10 commits into from
Nov 22, 2022

Conversation

Collaborator

@jshlbrd jshlbrd commented Oct 25, 2022

Description

  • Adds initial Protobuf definitions to proto/
    • Adds build dependencies in all Dockerfiles, including devcontainer
    • Adds build script in build/scripts/proto/compile.sh
  • Adds gRPC server and service package to internal/service
  • Adds gRPC sink to internal/sink
    • Adds internal/file to support loading a server certificate for the sink from one of many locations
  • Adds example application utilizing the gRPC Sink service
  • Updates devcontainer settings, including an install script

Motivation and Context

This update is the first step toward the project becoming a true "distributed system" (in the architectural sense) and enables users to deploy the system in new ways (described as future work below). The impact is significant, so we should address any open questions before approving the PR.

In the short term, this gives us the ability to support synchronous (sync) invocation in AWS Lambda. The problem we've had until now is that there was no way for the sink goroutine to send results back to the calling application -- this is a requirement in supporting sync invocations because the processed data must be returned by the Lambda handler. With this change we can send data from the sink to the caller by using a gRPC service for inter-process communication (IPC). This is shown in the diagram below where goroutines are represented by dotted lines and data flow is represented by solid arrows. (Using gRPC for IPC is described in more detail in the new example included in this PR.)

graph TD
handler -.- gRPC
handler -.- transform
handler -.- sink
handler --> ingest
ingest --> transform
transform --> sink
sink --> gRPC
gRPC --> handler
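The data flow in the diagram can be sketched in a few lines of Go. This is a simplified illustration, not the code in this PR: an in-process channel named `results` stands in for the gRPC service, and the function name `handle` and the uppercase transform are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// handle mimics a synchronous handler: it ingests the input, lets the
// transform and sink goroutines process it, and returns what the sink
// sent back. The results channel is a stand-in for the gRPC service.
func handle(input []string) []string {
	ingest := make(chan string)
	transformed := make(chan string)
	results := make(chan string) // stand-in for the gRPC service

	var wg sync.WaitGroup
	wg.Add(2)

	// transform goroutine: placeholder processing (uppercase).
	go func() {
		defer wg.Done()
		defer close(transformed)
		for s := range ingest {
			transformed <- strings.ToUpper(s)
		}
	}()

	// sink goroutine: sends processed data back toward the handler.
	go func() {
		defer wg.Done()
		defer close(results)
		for s := range transformed {
			results <- s
		}
	}()

	// ingest: runs concurrently so the handler can read results below.
	go func() {
		defer close(ingest)
		for _, s := range input {
			ingest <- s
		}
	}()

	// The handler collects everything the sink sent and returns it,
	// which is what a sync invocation requires.
	var out []string
	for s := range results {
		out = append(out, s)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(handle([]string{"a", "b"})) // prints "[A B]"
}
```

The important property is the return path: the sink has somewhere to send results that the handler can read before it returns. In the PR that return path is a gRPC service rather than a channel.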

Not supporting sync invocation was a blocker to supporting some AWS services (like data transformation in Kinesis Firehose) and to making the system behave like a data enrichment microservice. It's not yet clear how we'll implement this, but these changes will make it possible to deploy Substation as an "enrichment node" within the context of a larger system that can be invoked using the existing Lambda processor.

As for future work, an added benefit of gRPC is that, with little effort on our part, we can extend this functionality to systems beyond serverless AWS services. This PR adds a definition for a Sink service that mimics the internal Sink interface, but we could also add definitions that mimic other components of the system. For example, these definitions would turn every processor and inspector into a configurable microservice:

// Applicator mirrors the Applicator interface defined in process
service Applicator {
  rpc Apply(Capsule) returns (Capsule) {}
}

// Inspector mirrors the Inspector interface defined in condition
service Inspector {
  rpc Inspect(Capsule) returns (Decision) {}
}

Overall, writing Protobuf definitions based on the system's structs and interfaces would be a relatively safe way to make components accessible to external services (including services not written in Go) and to let others build their own distributed data pipelines on non-serverless infrastructure. (I don't anticipate the team at Brex doing this any time soon since it would increase complexity and we're happy with AWS Lambda, but if others are interested, then it's easy to support.)

How Has This Been Tested?

The new example included in this PR acts as an integration test for all of the new features, including the proto, the internal gRPC service, the internal gRPC server, and the gRPC sink.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@jshlbrd jshlbrd marked this pull request as ready for review October 25, 2022 15:35
@jshlbrd jshlbrd requested a review from a team as a code owner October 25, 2022 15:35
Contributor

@shellcromancer shellcromancer left a comment

Great new feature addition here!

Should we add some new CI checks related to verifying the changes in .proto files are safe/sane with something like buf?

internal/service/server.go (outdated, resolved)
internal/service/sink.go (outdated, resolved)
internal/file/file.go (resolved)
Collaborator Author

jshlbrd commented Nov 2, 2022

> Great new feature addition here!
>
> Should we add some new CI checks related to verifying the changes in .proto files are safe/sane with something like buf?

Yeah, we should have a CI check. I'll add that.

@jshlbrd jshlbrd merged commit 04b4917 into main Nov 22, 2022
@jshlbrd jshlbrd deleted the jshlbrd/grpc branch November 22, 2022 17:33