Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add operation orchestrator #694

Merged
merged 5 commits into from
May 8, 2024
Merged

feat: add operation orchestrator #694

merged 5 commits into from
May 8, 2024

Conversation

milesziemer
Copy link
Contributor

@milesziemer milesziemer commented Apr 9, 2024

Companion PR: awslabs/aws-sdk-swift#1435

Description of changes

Client libraries

Adds Orchestrator which implements the operation execution pipeline.
It uses the following new "runtime components" SelectAuthScheme, ApplyEndpoint,
ApplySigner, and ExecuteRequest, in addition to the existing RetryStrategy and
Interceptor components. Orchestrator supports executing a full request, in
addition to creating a presigned request - same as OperationStack.

I say "runtime components" because these new interfaces are essentially proxies
for the actual components. For example, SelectAuthScheme is just a convenient
way of using identity and auth within Orchestrator. This makes it easier to
migrate to Interceptors & Orchestrator, because the middlewares that currently
implement these components can just conform to the new interfaces. Again take
as an example SelectAuthScheme - AuthSchemeMiddleware currently does all the
work, so we just adapt it to the SelectAuthScheme interface and we can pass it
directly to Orchestrator. In the future, we may not need these interfaces at
all because there will be no middleware. But for now they help with transition.

All existing middleware (except for RetryMiddleware) have been extended to
either implement one of the new runtime component interfaces, or Interceptor.
RetryMiddleware's implementation was just copied (roughly) into Orchestrator because
it ended up being simpler than trying to abstract it into a pseudo component.
This does have the unfortunate consequence that there are essentially two
implementations of the core retry functionality for the time being. I also
adapted SdkHttpClient to be useable as ExecuteRequest.

OrchestratorBuilder is used in generated operation method bodies to build
the operation Orchestrator with the required runtime components. This is
analagous to how an OperationStack is currently used to tie together the
middleware needed for an operation.

There are a few other new interfaces too, under Message. These are the starting
points for Requests and Responses, and will be a unifying interface for all
'transport' message types. They were added here because Orchestrator actually
needs to use some pieces of the request, like the host for example, and it also
helps with the transition from middleware. Currently, serialization is done in
multiple steps, operation on a request builder. In order to replicate this
in Orchestrator, I've added RequestMessageSerializer which is an interface that
takes an operation input, and a RequestMessageBuilder, and modifies the builder
with whatever serialized data it pulled from the input. Because Orchestrator
needs to store the serializers, we also need a type for the builder, which is
where RequestMessageBuilder comes in. These interfaces are minimal right now,
but will be expanded in the future so we can move more middleware/interceptors
to being protocol-agnostic (right now they're all http specific).
but will be expanded in the future so we can move more middleware/interceptors
to being protocol-agnostic (right now they're all http specific).

A comparison of a current generated operation body with what it will look like
using Orchestrator (I've omitted some details for readability):

public func getFoo(input: GetFooInput) async throws GetFooOutput {
    let context = HttpContextBuilder()
        ...
        .build()
    var operation = OperationStack(id: "getFoo")
    operation.initializeStep.intercept(position: .after, middleware: URLPathMiddleware(...))
    operation.initializeStep.intercept(position: .after, middleware: URLHostMiddleware())
    operation.serializeStep.intercept(position: .after, middleware: QueryItemMiddleware(...))
    operation.serializeStep.intercept(position: .after, middleware: HeaderMiddleware(...))
    operation.buildStep.intercept(position: .before, middleware: EndpointResolverMiddleware(...))
    operation.buildStep.intercept(position: .before, middleware: AuthSchemeMiddleware())
    operation.finalizeStep.intercept(position: .after, middleware: RetryMiddleware(...))
    operation.finalizeStep.intercept(position: .before, middleware: SignerMiddleware())
    operation.deserializeStep.intercept(position: .after, middleware: DeserializeMiddleware(...))
    operation.deserializeStep.intercept(position: .after, middleware: LoggerMiddleware(...))
    return try await operation.handleMiddleware(context: context, input: input, next: client.getHandler())
}
public func getFoo(input: GetFooInput) async throws GetFooOutput {
    let context = HttpContextBuilder()
        ...
        .build()
    let builder = OrchestratorBuilder()
    builder.interceptors.add(URLPathMiddleware(...))
    builder.interceptors.add(URLHostMiddleware())
    builder.serialize(QueryItemMiddleware(...))
    builder.serialize(HeaderMiddleware(...))
    builder.applyEndpoint(EndpointResolverMiddleware(...))
    builder.selectAuthScheme(AuthSchemeMiddleware())
    // These used to be passed to retry middleware
    builder.retryStrategy(...)
    builder.retryErrorInfoProvider(...)
    builder.applySigner(SignerMiddleware())
    builder.deserialize(DeserializeMiddleware(...))
    builder.interceptors.add(LoggerMiddleware(...))
    let op = builder.attributes(context) // context can be changed in the function body
        .executeRequest(client)
        .build()
    return try await op.execute(input: input)
}

Some other changes were made to the existing interceptors interfaces:

  • Made InterceptorContext a reference type so we don't need to use inout everywhere
  • Switch InterceptorContext 'output' -> 'result' which is a Result type containing
    the operation output. This allows us to capture error responses and other errors
    that occur during execution, so we don't immediately throw from wherever the error
    occurs.

Codegen

Updates the codegenerator to be able to generate operations that use
Orchestrator and Interceptors instead of OperationStack. This is
configurable using the useInterceptors codegen setting. Because most
middleware was extended to be useable by Orchestrator, codegen only
needed to be able to render middleware a little differently based on
whether useInterceptors was set.

Ultimately we won't need useInterceptors, and we can fully refactor
the codegenerator + libraries, but for now this makes it easier and
safer to migrate.

Some test cases had to be updated to expect generic params where they
were previously inferred (they can't be inferred with interceptors,
but the middleware initializer is the same for both code paths)

Scope

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Comment on lines 188 to 190
// TODO: This only throws because getting the partitionId and initial token are fallible.
// That's fine, but where do we want to go if they fail? Do we just skip retries entirely
// and go to an attempt?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should just fail. I'm not sure where partitionId is populated, but failing to acquire the initial retry token is an indication of a buggy retry strategy implementation.

@milesziemer milesziemer force-pushed the interceptors-impl branch 2 times, most recently from 66d995a to 3c81a81 Compare May 1, 2024 15:59
Comment on lines 180 to 181
/// These methods throw the last error that occurred when executing a given hook, so if multiple
/// interceptors fail, only the last error is thrown. The rest are logged (TODO)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a goal to remove all TODOs in our SDK, if we plan to log the additional errors let's either include that logic or create a tracking ticket for that work and pull it into sprint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, forgot about this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 184 to 186
var error: Error?
for i in interceptors {
do {
try await i.readBeforeExecution(context: context)
} catch let e {
error = e
}
}
if let error = error {
throw error
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 184-194 are repeated in each of these funcs with only one line differing, can we create a higher-order function that takes the relevant function as input? Should reduce code duplication and shorten this file by a lot

Ex. I didn't test this but maybe something like the following (also note potential suggestion on accomplishing logging)

internal func executeInterceptors<T>( // can make the name better
    context: InterceptorType.InterceptorContextType, 
    operation: @escaping (InterceptorType) async throws -> T //
) async throws {
    var error: Error? // suggestion: you could change this to a list 
    for i in interceptors {
        do {
            try await operation(i)
        } catch let e {
            error = e // append to list
        }
    }
   if let error = error {
        throw error // loop through list of errors and log the errors then throw the last one
   }
}

then would modify these functions like so:

internal func readBeforeExecution(context: InterceptorType.InterceptorContextType) async throws {
    try await executeInterceptors(context: context) { interceptor in
        try await interceptor.readBeforeExecution(context: context)
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

case .data(let data):
input.headers.update(name: "Content-Length", value: String(data?.count ?? 0))
builder.headers.update(name: "Content-Length", value: String(data?.count ?? 0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could move data?.count ?? 0 but I'll leave that up to you

let contentLength = data?.count ?? 0
builder.headers.update(name: "Content-Length", value: String(contentLength))


public func apply(input: OperationStackInput, builder: SdkHttpRequestBuilder, attributes: HttpContext) throws {
let bodyString = input[keyPath: keyPath]?.rawValue ?? ""
builder.withBody(.data(Data(bodyString.utf8)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can use a variable for readability

// Convert to utf8
let bodyData = Data(bodyContent.utf8)

builder.withBody(.data(bodyData))

Comment on lines +22 to +44
if (ctx.settings.useInterceptors) {
writer.write(
"""
builder.applyEndpoint({ (request, scheme, attributes) in
return request.toBuilder()
.withMethod(attributes.getMethod())
.withPath(attributes.getPath())
.withHost("\(attributes.getHostPrefix() ?? "")\(attributes.getHost() ?? "")")
.build()
})
""".trimMargin()
)
} else {
writer.openBlock(
"$operationStackName.${middlewareStep.stringValue()}.intercept(position: ${position.stringValue()}, id: \"${name}\") { (context, input, next) -> \$N<$outputShapeName> in", "}",
ClientRuntimeTypes.Middleware.OperationOutput
) {
writer.write("input.withMethod(context.getMethod())")
writer.write("input.withPath(context.getPath())")
writer.write("let host = \"\\(context.getHostPrefix() ?? \"\")\\(context.getHost() ?? \"\")\"")
writer.write("input.withHost(host)")
writer.write("return try await next.handle(context: context, input: input)")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very basic Q: Because I am unfamiliar with middlewares vs interceptors, is interceptors something that wont always be used (hence the ctx.settings.useInterceptors)? In which case, do customizations need to follow this same code logic of if interecptors then x else y for middleware?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the short term, yes. Ultimately we won't have useInterceptors config setting, and the SDK will only use interceptors. But to make the transition easier, I made it configurable here.

@syall syall mentioned this pull request May 6, 2024
4 tasks
Comment on lines 185 to 186
for i in interceptors {
do {
try await i.readBeforeExecution(context: context)
} catch let e {
error = e
}
}
if let error = error {
throw error
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something I didn't catch, but is there a reason why we don't fail on the first error encountered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its a safer contract than failing on the first error encountered, in case you have interceptors dependent on something occurring in a previous interceptor, like setting up some resources. There's still issues, but at least you know your interceptor will run.

@milesziemer milesziemer force-pushed the interceptors-impl branch 2 times, most recently from b30177c to 804529c Compare May 7, 2024 15:50
Adds Orchestrator which implements the operation execution pipeline.
It uses the following new "runtime components" SelectAuthScheme, ApplyEndpoint,
ApplySigner, and ExecuteRequest, in addition to the existing RetryStrategy and
Interceptor components. Orchestrator supports executing a full request, in
addition to creating a presigned request - same as OperationStack.

I say "runtime components" because these new interfaces are essentially proxies
for the actual components. For example, SelectAuthScheme is just a convenient
way of using identity and auth within Orchestrator. This makes it easier to
migrate to Interceptors & Orchestrator, because the middlewares that currently
implement these components can just conform to the new interfaces. Again take
as an example SelectAuthScheme - AuthSchemeMiddleware currently does all the
work, so we just adapt it to the SelectAuthScheme interface and we can pass it
directly to Orchestrator. In the future, we may not need these interfaces at
all because there will be no middleware. But for now they help with transition.

All existing middleware (except for RetryMiddleware) have been extended to
either implement one of the new runtime component interfaces, or Interceptor.
RetryMiddleware's implementation was just copied into Orchestrator because
it ended up being simpler than trying to abstract it into a pseudo component.
This does have the unfortunate consequence that there are essentially two
implementations of the core retry functionality for the time being. I also
adapted SdkHttpClient to be useable as ExecuteRequest.

OrchestratorBuilder is used in generated operation method bodies to build
the operation Orchestrator with the required runtime components. This is
analagous to how an OperationStack is currently used to tie together the
middleware needed for an operation.

There are a few other new interfaces too, under Message. These are the starting
points for Requests and Responses, and will be a unifying interface for all
'transport' message types. They were added here because Orchestrator actually
needs to use some pieces of the request, like the host for example, and it also
helps with the transition from middleware. Currently, serialization is done in
multiple steps, operation on a request builder. In order to replicate this
in Orchestrator, I've added RequestMessageSerializer which is an interface that
takes an operation input, and a RequestMessageBuilder, and modifies the builder
with whatever serialized data it pulled from the input. Because Orchestrator
needs to store the serializers, we also need a type for the builder, which is
where RequestMessageBuilder comes in. These interfaces are minimal right now,
but will be expanded in the future so we can move more middleware/interceptors
to being protocol-agnostic (right now they're all http specific).
but will be expanded in the future so we can move more middleware/interceptors
to being protocol-agnostic (right now they're all http specific).

A comparison of a current generated operation body with what it will look like
using Orchestrator (I've omitted some details for readability):
```
public func getFoo(input: GetFooInput) async throws GetFooOutput {
    let context = HttpContextBuilder()
        ...
        .build()
    var operation = OperationStack(id: "getFoo")
    operation.initializeStep.intercept(position: .after, middleware: URLPathMiddleware(...))
    operation.initializeStep.intercept(position: .after, middleware: URLHostMiddleware())
    operation.serializeStep.intercept(position: .after, middleware: QueryItemMiddleware(...))
    operation.serializeStep.intercept(position: .after, middleware: HeaderMiddleware(...))
    operation.buildStep.intercept(position: .before, middleware: EndpointResolverMiddleware(...))
    operation.buildStep.intercept(position: .before, middleware: AuthSchemeMiddleware())
    operation.finalizeStep.intercept(position: .after, middleware: RetryMiddleware(...))
    operation.finalizeStep.intercept(position: .before, middleware: SignerMiddleware())
    operation.deserializeStep.intercept(position: .after, middleware: DeserializeMiddleware(...))
    operation.deserializeStep.intercept(position: .after, middleware: LoggerMiddleware(...))
    return try await operation.handleMiddleware(context: context, input: input, next: client.getHandler())
}
```
```
public func getFoo(input: GetFooInput) async throws GetFooOutput {
    let context = HttpContextBuilder()
        ...
        .build()
    let builder = OrchestratorBuilder()
    builder.interceptors.add(URLPathMiddleware(...))
    builder.interceptors.add(URLHostMiddleware())
    builder.serialize(QueryItemMiddleware(...))
    builder.serialize(HeaderMiddleware(...))
    builder.applyEndpoint(EndpointResolverMiddleware(...))
    builder.selectAuthScheme(AuthSchemeMiddleware())
    // These used to be passed to retry middleware
    builder.retryStrategy(...)
    builder.retryErrorInfoProvider(...)
    builder.applySigner(SignerMiddleware())
    builder.deserialize(DeserializeMiddleware(...))
    builder.interceptors.add(LoggerMiddleware(...))
    let op = builder.attributes(context) // context can be changed in the function body
        .executeRequest(client)
        .build()
    return try await op.execute(input: input)
}
```

Note: some middleware, like EndpointResolverMiddleware, are codegenerated,
which will be handled in a followup commit.

Some other changes were made to the existing interceptors interfaces:
- Made InterceptorContext a reference type so we don't need to use `inout` everywhere
- Switch InterceptorContext 'output' -> 'result' which is a Result type containing
  the operation output. This allows us to capture error responses and other errors
  that occur during execution, so we don't immediately throw from wherever the error
  occurs.

One other change was made to the existing interceptors interface InterceptorContext,
to make it require a reference type, so we don't need to use `inout` everywhere.
Updates the codegenerator to be able to generate operations that use
Orchestrator and Interceptors instead of OperationStack. This is
configurable using the `useInterceptors` codegen setting. Because most
middleware was extended to be useable by Orchestrator, codegen only
needed to be able to render middleware a little differently based on
whether `useInterceptors` was set.

Ultimately we won't need `useInterceptors`, and we can fully refactor
the codegenerator + libraries, but for now this makes it easier and
safer to migrate.

Some test cases had to be updated to expect generic params where they
were previously inferred (they can't be inferred with interceptors,
but the middleware initializer is the same for both code paths)
@milesziemer milesziemer merged commit cf52198 into main May 8, 2024
17 checks passed
@milesziemer milesziemer deleted the interceptors-impl branch May 8, 2024 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants