Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for flexible checksums on streaming payloads #659

Merged
merged 80 commits into from
Mar 21, 2024

Conversation

dayaffe
Copy link
Contributor

@dayaffe dayaffe commented Feb 7, 2024

Issue #

awslabs/aws-sdk-swift#1319

Description of changes

  • FlexibleChecksumsRequestMiddleware
    • Sets headers response for enabling chunked encoding
    • Checks if a checksum is requested, if so it will either calculate it and set appropriate header (for normal payload), or set the necessary data for to calculate it during chunk writing for streams
    • Calculation workflow will be skipped if user provides their own checksum or does not opt-in to use checksums
  • FlexibleChecksumsResponseMiddleware
    • For normal payloads this will validate the checksum
    • For streaming payloads stream will becoming a Validating stream that validates the checksum when the stream is read

Also:

  • Fixes a race condition in FoundationStreamBridge where data could be written to its output out-of-order.

Scope

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor

@sichanyoo sichanyoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple questions.

Copy link
Contributor

@sichanyoo sichanyoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will approve once all swift lint warnings get addressed.

Copy link
Contributor

@sichanyoo sichanyoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, after corrresponding PR in aws-sdk-swift passes all CIs, should be good to go.

Copy link
Contributor

@jbelkins jbelkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed several issues in this feature, both in this PR and in awslabs/aws-sdk-swift#1329.

Some refactoring & performance improvements are required, but will be done in a separate PR.

import AwsCommonRuntimeKit

/// Reads data from the input stream, chunks it, and passes it out through its output stream.
class AWSChunkedStream {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be a protocol, I changed it to a concrete stream type that chunks its input and outputs an AWS-chunked stream.

self.outputStream.isSeekable
}

func read(upToCount count: Int) throws -> Data? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read() and readAsync() will use the input stream and reader to create a new chunk once the output stream has been exhausted. This will prevent chunks from building up in memory well in advance of when they can be transmitted.

}
}

public func setAwsChunkedBody(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method now just wraps the body stream in a AWSChunkedStream when the request is chunked.

}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk method that used to be here is removed; AWSChunkedStream and the CRT HTTP client now have their own code for this.

@@ -165,6 +165,49 @@ public class CRTClientEngine: HTTPClient {
do {
let stream = try connection.makeRequest(requestOptions: requestOptions)
try stream.activate()
if request.isChunked {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code now incorporates the code that used to be in the static method for chunking a body.

/// Actor used to isolate the stream status from multiple concurrent accesses.
actor ReadableStreamStatus {
/// A Logger for logging events.
private let logger: LogAgent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put a Logger in this type, and I also fixed the issue where stream writes can get out-of-order.

@@ -130,7 +130,7 @@ public final class URLSessionHTTPClient: HTTPClient {
func urlSession(
_ session: URLSession, dataTask: URLSessionDataTask, didReceive response: URLResponse
) async -> URLSession.ResponseDisposition {
logger.debug("urlSession(_:dataTask:didReceive:) called")
logger.debug("urlSession(_:dataTask:didReceive response:) called")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging changes in this type, plus I adjusted the buffer size for streams a bit.

@@ -61,6 +61,13 @@ public class BufferedStream: Stream {
/// Access this value only while `lock` is locked, to prevent simultaneous access.
private var _buffer: Data

/// The number of bytes currently in the buffer, awaiting read.
public var bufferCount: Int {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this bufferCount property (to see if the stream is holding any data in-memory) and a little refactoring in this type.

import SmithyTestUtil
@testable import ClientRuntime

class FlexibleChecksumsMiddlewareTests: XCTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reordered some of the steps in these tests since the middleware doesn't call next() at the very end of execution anymore.

@@ -102,7 +111,7 @@ public struct FlexibleChecksumsResponseMiddleware<OperationStackOutput>: Middlew
throw ClientError.dataNotFound("Cannot calculate the checksum of an empty body!")
}

return try await next.handle(context: context, input: input)
return output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rearranged where next.handle() is called in this middleware so that multiple GET requests are not made to the server.

} else {
let currentChunkBody = awsChunkedStream.chunkedReader.getCurrentChunkBody()
if !currentChunkBody.isEmpty {
try await http1Stream.writeChunk(chunk: awsChunkedStream.chunkedReader.getCurrentChunk(), endOfStream: false)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lint warning

@jbelkins jbelkins merged commit 4d0ffae into main Mar 21, 2024
12 checks passed
@jbelkins jbelkins deleted the day/aws-chunked-encoding branch March 21, 2024 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants