Optimise S3 interaction #2466

Closed
pomadchin opened this issue Nov 2, 2017 · 1 comment · Fixed by #2911

Comments

@pomadchin
Member

We found that S3 connections are not parallelized well in the AWS SDK for Java.
Measurements showed that on slow connections it pays to parallelize by spawning more threads and splitting the requested data into smaller chunks; the slower the connection, the more noticeable the difference. Given this, we should investigate how to speed up S3 connections that are already fast (but probably not fast enough), i.e. how to speed up getObject(request).getObjectContent calls.

  1. How can we use Futures, and how can we determine an optimal way to split the data into chunks so that everything is parallelized using setRange queries? Does this approach make sense at all?
  2. There is an interesting TransferManager API, which works faster (or should work faster, as my tests were limited) and performs highly parallelized downloads. Its disadvantage is that it works only with files (it downloads data from S3 into files). We could consider building an in-memory version of it and verifying that it actually affects object download times. There is already an issue in their repo: S3 TransferManager Should Allow Downloading to Stream aws/aws-sdk-java#893
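
The first question can be explored with a small sketch. The Java example below splits an object of known size into inclusive byte ranges and fetches them concurrently into one in-memory buffer. RangedDownload, RangeFetcher, splitRanges and download are hypothetical names introduced here for illustration; in real use the fetcher would presumably wrap a getObject call on a request that has setRange(start, end) applied, which is left out so that the sketch runs without the AWS SDK. Chunk size and thread count are parameters because their optimal values are exactly what this issue proposes to measure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RangedDownload {
    /** Hypothetical abstraction over "read bytes [start, endInclusive] of one object". */
    interface RangeFetcher {
        byte[] fetch(long start, long endInclusive) throws Exception;
    }

    /** Splits [0, totalSize) into consecutive {start, endInclusive} pairs of at most chunkSize bytes. */
    static List<long[]> splitRanges(long totalSize, long chunkSize) {
        if (totalSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalSize must be >= 0 and chunkSize > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < totalSize; start += chunkSize) {
            long end = Math.min(start + chunkSize, totalSize) - 1; // inclusive end
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    /** Fetches all ranges on a fixed thread pool and assembles them into one byte array. */
    static byte[] download(long totalSize, long chunkSize, int threads, RangeFetcher fetcher)
            throws Exception {
        // In-memory assembly limits the object to what fits in a single Java array.
        byte[] result = new byte[Math.toIntExact(totalSize)];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (long[] range : splitRanges(totalSize, chunkSize)) {
                Callable<Void> task = () -> {
                    byte[] part = fetcher.fetch(range[0], range[1]);
                    long expected = range[1] - range[0] + 1;
                    if (part.length != expected) {
                        throw new IllegalStateException("short read for range starting at " + range[0]);
                    }
                    // Ranges do not overlap, so concurrent writes touch disjoint slices.
                    System.arraycopy(part, 0, result, (int) range[0], part.length);
                    return null;
                };
                futures.add(pool.submit(task));
            }
            for (Future<Void> f : futures) {
                f.get(); // waits for the chunk and rethrows any fetch failure
            }
        } finally {
            pool.shutdown();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for S3: serve ranges out of a local array.
        byte[] source = new byte[10_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        RangeFetcher fake = (s, e) -> Arrays.copyOfRange(source, (int) s, (int) e + 1);
        byte[] copy = download(source.length, 1024, 4, fake);
        System.out.println("round trip ok: " + Arrays.equals(source, copy));
    }
}
```

Timing download against a single unsplit fetch for several chunk sizes and thread counts would give a first answer to whether the approach makes sense on a given connection.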

This is a small R&D issue to clarify the AWS S3 API and to double-check that we use it efficiently.

@pomadchin
Member Author

This would be resolved via #2302.

@moradology moradology mentioned this issue Apr 25, 2019
3 tasks
@echeipesh echeipesh added this to the 3.0 milestone May 7, 2019