Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Pass at an IO layer for the Rust SDK #107

Closed
wants to merge 61 commits into from
Closed

Conversation

rcoh
Copy link
Collaborator

@rcoh rcoh commented Jan 12, 2021

This has middleware, endpoint providers & signing middleware & retries. Operation shapes are codegennerated against these traits.
Review this and give your thoughts—over the coming weeks I'll be refactoring and pulling off individual pieces into separate PRs as I clean things up.

rcoh added 4 commits January 5, 2021 10:18
Review this and give your thoughts—over the coming weeks I'll be refactoring and pulling off individual pieces into separate PRs as I clean things up.
@rcoh rcoh requested review from davidbarsky, LucioFranco and a team January 12, 2021 23:41

// TODO: consider field privacy, builders, etc.
pub struct Operation<H> {
pub base: http::Request<SdkBody>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid pub fields like this in general.

Copy link

@jasdel jasdel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the SDK implement, or have a design for request retry? Will retry exist as a middleware, Client's call method, or some other outside behavior? Benefit of having retry as a middleware post serialization, is that the client only needs to serialize the request once across all retry attempts. (Using "serialize" to refer to serialization of input, not request signing)

Comment on lines 83 to 86
#[derive(Eq, PartialEq)]
pub enum SigningAlgorithm {
SigV4,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to consider not definingSigningAlgorithm in a generic auth core library. The concept of a v4 signer probably should be its own module. Smithy allows API modules to define which signing algorithm(s) they support. Defining this in a central module will create more coupling than you probably want.

Consider moving this module to a SigV4 namespace. Its debatable of Credentials and ProvideCredentials should be generic or specific to SigV4. SigV2 did use the same credentials.

service_config,
request_config,
double_uri_encode: false,
normalize_uri_path: true,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is relatively risky to do generically. You might want to consider pushing this into the behavior of the generate API clients. The generated client's middleware that serializes the HTTP URL, probably should own any url path normalization that is needed.

Primary reason for this is an SDK should be able to send request without signing, and that URL to be correct. e.g. S3 and STS operations are examples.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is copied pretty much directly from the CRT signing config options—I guess it's needed somewhere?

Comment on lines 139 to 142
#[derive(Clone)]
pub struct HttpSigner {}

impl HttpSigner {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to scope this as a Sigv4 HTTP signer.

Comment on lines 51 to 96
impl<S> Client<S>
where
S: Service<http::Request<SdkBody>, Response = http::Response<hyper::Body>> + Clone,
S::Error: std::error::Error + 'static,
{
pub async fn call<O, R, E>(&self, input: Operation<O>) -> Result<SdkResponse<R>, SdkError<E>>
where
O: ParseHttpResponse<hyper::Body, Output = Result<R, E>>,
{
let ready_service = ReadyOneshot::new(self.inner.clone())
.await
.map_err(|e| SdkError::DispatchFailure(e.into()))?;

let signer = OperationRequestMiddlewareLayer::for_middleware(SigningMiddleware::new());
let endpoint_resolver = OperationRequestMiddlewareLayer::for_middleware(EndpointMiddleware);
let mut ready_service = ServiceBuilder::new()
.layer(signer)
.layer(endpoint_resolver)
.layer(DispatchLayer)
.service(ready_service);
// TODO: enable operations to specify their own extra middleware to add
let handler = input.response_handler;
let mut response: http::Response<hyper::Body> = ready_service
.call(input.request)
.await
.map_err(|e| match e {
OperationError::DispatchError(e) => SdkError::DispatchFailure(Box::new(e)),
OperationError::ConstructionError(e) => SdkError::DispatchFailure(e),
})?;

let parsed = handler.parse_unloaded(&mut response);
let mut response = match parsed {
Some(Ok(r)) => {
return Ok(SdkResponse {
raw: response,
parsed: r,
});
}
Some(Err(e)) => {
return Err(SdkError::ServiceError {
raw: response,
err: e,
});
}
None => response,
};
Copy link

@jasdel jasdel Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd caution against using an omni call function for all API clients to use to invoke operations. This will potentially create more coupling and ridged design that may make it difficult to implement future API client customizations. e.g. eventstream, presigning, custom serde, automatic file of input parameters, input parameter modifications, etc.

For example how would an API client be generated for MQTT, CBOR, or other non HTTP transports?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • non-http clients use a totally different client—eg. aws-mqtt
  • custom serde gets handled in the service crates on an operation-by operation basis (in the process of constructing an Operation
  • eventstream is handled by handler.parse_unloaded() returning an event stream struct than manages the stream
  • idempotency tokens are handled (already) during the construction of the Operation

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For eventstream, how would the stream of input events be handled? Is that spun off from the parse_unloaded as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stream of input events are handled when the operation is built:
(The HTTP body is a channel, we end the send-end to the end user, the receive end feeds the HTTP body)
roughly:

let (tx, operation) = TranscribeStreaming::builder().build();
let response = client.call(operation);
tx.send(TranscriptStreamingEvent::build().audio(<some audio bytes>).await;
let transcription = response.next_event().await?;

Copy link
Collaborator Author

@rcoh rcoh Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also handle it as part of the EventStream spun out from parse unloaded a .send_event on the EventStream. This would enable code like:

let operation = TranscribeStreaming::builder().build();
let event_stream = client.call(operation).await.expect("failed to start event stream");
let (tx, rx) = response.channels();
tx.send(TranscriptStreamingEvent::build().audio(<some audio bytes>).await;
let transcription = rx.recv().await?;

Comment on lines 16 to 20
pub enum SdkBody {
Once(Option<Bytes>),
}

impl http_body::Body for SdkBody {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty specific to HTTP, considering organizing protocol specific things out of the core, so that the SDK's clients can be generated for other protocols such as MQTT, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this crate should really be called http_operation (Or go probably move into smithy-http)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, see if you can separate HTTP and operation from any AWS flavoring. This will make it easier to create alternative clients in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, the nice part of the property-bag style design is that Operation only contains an HTTP request by default so it can actually be defined in what provide as smithy whitelabel

Comment on lines 18 to 31
impl StaticEndpoint {
pub fn uri(&self) -> &Uri {
&self.0
}
pub fn from_service_region(svc: impl AsRef<str>, region: impl AsRef<str>) -> Self {
StaticEndpoint(
Uri::from_str(&format!(
"https://{}.{}.amazonaws.com",
svc.as_ref(),
region.as_ref()
))
.unwrap(),
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this type intended for users to be able to configure custom endpoints? If so consider how the user will provide custom endpoints that should or should not be modified by the SDK. For example if customer provides an endpoint for the S3 API client, how will the Client know if it is allowed to modify the endpoint for things like dualstack, accelerate, accesspoints, buckets, etc?

This concept should also be considered with behavior such as Endpoint Discovery, where the API client must first request an endpoint from the API before making select operation calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I need to sit down and design it, but that's a good point—there needs to be a concept of endpoints that are prefixable by the S3 middleware vs. endpoints that are immutable. Good call out.

Comment on lines 149 to 156
pub fn sign<B>(
&self,
signing_config: &HttpSigningConfig,
credentials: &Credentials,
request: &mut http::Request<B>,
payload: impl AsRef<[u8]>,
) -> Result<(), SigningError> {
if signing_config.algorithm != SigningAlgorithm::SigV4
Copy link

@jasdel jasdel Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also consider how the Signer will behave with endpoint URLs containing common ports, e.g. 80, 443. These should not be signed as a part of the request signature, but may be persisted in the URL's endpoint when the request is made.

aws/aws-sdk-go#1618

Comment on lines 12 to 34
#[derive(Clone)]
pub struct SigningMiddleware {
signer: HttpSigner,
}

impl Default for SigningMiddleware {
fn default() -> Self {
SigningMiddleware::new()
}
}

impl SigningMiddleware {
pub fn new() -> Self {
SigningMiddleware {
signer: HttpSigner {},
}
}
}

pub trait SigningConfigExt {
fn signing_config(&self) -> Option<&SigningConfig>;
fn insert_signing_config(&mut self, signing_config: SigningConfig) -> Option<SigningConfig>;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving this to a sigv4 specific module. While today signing is SigV4, this could change in the future. API clients should be generated with references to the specific signing capabilities they support instead of implying the specific signing versions are the standard.

Comment on lines 29 to 33
ServiceConfig.BuilderPreamble -> rust(
// TODO: design a config that enables resolving the default region chain
// clone because the region is also used
"""let region = self.region.unwrap_or_else(|| "us-east-1".to_string());""",
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to require the API client always be configured with a region. and never default to some "built in". Auto selecting regions has been a constant issue for SDKs that default to some region/endpoint, assuming that that endpoint is a valid fallback.

For all intensive purposes SDKs require an API region, even APIs that have a global endpoint.

Region is required for two components of an API client. 1.) endpoint resolving, 2.) Signing. You could consider to make only those two components lazy validate if the client is configured with a region. This allows your API clients to work with customer's providing custom endpoints, and operations that don't need signing.

Comment on lines 49 to 58
pub trait ProvideEndpoint: Send + Sync {
fn set_endpoint(&self, request_uri: &mut Uri);
}

impl ProvideEndpoint for StaticEndpoint {
fn set_endpoint(&self, request_uri: &mut Uri) {
let new_uri = self.apply(request_uri);
*request_uri = new_uri;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endpoints providers need to be able to provide of at least URL, signing name, signing region, and potentially other metadata when resolving endpoints. endpoints can be modeled with all of these. There is no contract that endpoint prefix, nor serviceId (aka sdkId) matches the signing name. Same goes for region identifier passed into the endpoint resolver, e.g. aws-global, or S3's fips-us-gov-west-1.

@rcoh
Copy link
Collaborator Author

rcoh commented Jan 19, 2021

Does the SDK implement, or have a design for request retry? Will retry exist as a middleware, Client's call method, or some other outside behavior? Benefit of having retry as a middleware post serialization, is that the client only needs to serialize the request once across all retry attempts. (Using "serialize" to refer to serialization of input, not request signing)

There is a design but is hasn't been implemented yet. Retry will be a middleware (although for various reasons, it's something that gets plugged in late). The retry layer happens post serialization (creation of the Operation object), but before running the signing / endpoint resolution middleware.

Performing an async sleep in Rust requires picking a runtime—since we don't want to force users into a runtime, the retry middleware will be per-runtime (although it will basically just be 4-10 lines, calling a method to check to see if a response is retryable, asking the retry controller how long to sleep for, and executing the sleep)

@jasdel
Copy link

jasdel commented Jan 19, 2021

Performing an async sleep in Rust requires picking a runtime—since we don't want to force users into a runtime, the retry middleware will be per-runtime

Would the middleware default to a "common" pattern, and allow users to override the runtime when creating the client? Or would the user be required to specify the runtime up front? Or would the middleware have handling for the various different kinds of runtime? (Sorry fairly ignorant on various runtime in rust :) )

@rcoh
Copy link
Collaborator Author

rcoh commented Jan 19, 2021

you can't really "override" the runtime after compile time. Probably we would provide clients where we've chosen the runtime for you (eg. Tokio, CRT), these clients would include their own retry middleware implementation (so for the end user, this isn't really visible).

It only becomes visible if you're writing your own IO layer, a la what is currently inside of aws-hyper (which I expect a small but non-zero of customers to do).

In general, the runtime is a project-wide choice that you make—we keep as much of our code runtime-agnostic as possible. The code that isn't runtime agnostic is brought in when you make the runtime choice (by choosing a specific dependency)

@rcoh
Copy link
Collaborator Author

rcoh commented Mar 12, 2021

everything in this PR has landed in some shape or form

@rcoh rcoh closed this Mar 12, 2021
@rcoh rcoh deleted the cred-providers branch August 3, 2021 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants