We found that S3 connections are not parallelized well in the AWS SDK for Java.
Measurements showed that on slow connections it pays off to parallelize a download by spawning more threads and splitting the request into smaller chunks. The slower the connection, the more noticeable the difference. Given this, we should investigate whether the same approach can speed up S3 connections that are already fast but perhaps not fast enough, that is, how to speed up `getObject(request).getObjectContent` calls.
Open questions: how can we use `Futures`, and how do we determine an optimal way to split the data into chunks so that everything can be fetched in parallel with `setRange` queries? Does this approach make sense at all?
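To make the question concrete, here is a minimal sketch of the range-splitting idea, not a tuned implementation. `RangeFetcher`, `ParallelRangeDownload`, and the chunk size are hypothetical names and parameters introduced for illustration. With the v1 SDK, `fetch` could wrap something like `s3.getObject(new GetObjectRequest(bucket, key).withRange(start, end)).getObjectContent()` and read the stream fully. The object size is assumed to be known up front, for example from the object metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Hypothetical abstraction: returns the bytes [start, endInclusive] of one object. */
interface RangeFetcher {
    byte[] fetch(long start, long endInclusive);
}

final class ParallelRangeDownload {

    /** Splits [0, size) into inclusive {start, end} pairs of at most chunkSize bytes. */
    static List<long[]> splitRanges(long size, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < size; start += chunkSize) {
            // Byte ranges are inclusive on both ends, hence the "- 1".
            ranges.add(new long[] {start, Math.min(start + chunkSize, size) - 1});
        }
        return ranges;
    }

    /** Fetches all ranges concurrently on the pool and reassembles them in order. */
    static byte[] download(RangeFetcher fetcher, long size, long chunkSize, ExecutorService pool) {
        List<long[]> ranges = splitRanges(size, chunkSize);
        List<CompletableFuture<byte[]>> parts = new ArrayList<>();
        for (long[] r : ranges) {
            parts.add(CompletableFuture.supplyAsync(() -> fetcher.fetch(r[0], r[1]), pool));
        }
        // Assumes the whole object fits into a single in-memory array.
        byte[] result = new byte[Math.toIntExact(size)];
        for (int i = 0; i < parts.size(); i++) {
            byte[] chunk = parts.get(i).join(); // rethrows a failed fetch as CompletionException
            System.arraycopy(chunk, 0, result, Math.toIntExact(ranges.get(i)[0]), chunk.length);
        }
        return result;
    }
}
```

The pool size and `chunkSize` are exactly the knobs this issue asks about. The sketch does not pick them; they would have to be measured against real S3 traffic.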
There is an interesting `TransferManager` API, which is faster (or should be, as my tests were limited) and performs highly parallelized downloads. Its disadvantage is that it works only with files: it downloads data from S3 into files. We could consider building an in-memory version of it and verifying that it actually improves object download times. There is already an issue in their repo: S3 TransferManager Should Allow Downloading to Stream aws/aws-sdk-java#893
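One possible shape for an in-memory variant, sketched under assumptions: if the chunks are fetched in parallel as futures of byte arrays, they can be exposed to callers as a single `InputStream` instead of a file. `ChunkedStream` is a hypothetical name, and nothing here is part of `TransferManager`. It uses only JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class ChunkedStream {

    /** Presents chunk futures, already sorted in object order, as one InputStream. */
    static InputStream concat(List<CompletableFuture<byte[]>> orderedChunks) {
        Iterator<CompletableFuture<byte[]>> it = orderedChunks.iterator();
        return new SequenceInputStream(new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return it.hasNext();
            }

            @Override
            public InputStream nextElement() {
                // Waits for this chunk only; later chunks keep downloading meanwhile.
                return new ByteArrayInputStream(it.next().join());
            }
        });
    }
}
```

The trade-off is memory: every completed chunk stays buffered until the reader reaches it, so the number of in-flight chunks would need a bound for large objects.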
This is a small R&D issue to clarify the AWS S3 API and to double-check that we use it efficiently.