
[S3 upload] API to upload outputstream #1268

Closed
16srana opened this issue Aug 8, 2017 · 8 comments
Labels: guidance Question that needs advice or information.

16srana commented Aug 8, 2017

The current API in the SDK accepts either a File or an InputStream to upload content, which is more of a pull mechanism: the SDK pulls the content from the source while uploading to S3.
Could we have an API that works more like a push mechanism, i.e. one that hands back an OutputStream to which we can push any content?

We are actually using S3 to distribute large files (several GBs) to our users, which we generate based on each user's request. Currently we save the file locally and then upload it; ideally we would like to generate/write the zip file directly to S3.

millems (Contributor) commented Aug 10, 2017

Can you describe your use case in a little more depth? You'd like us to return an OutputStream to which you can push content?

Something like this? (Ignore syntax for now, just trying to understand)

SdkOutputStream<PutObjectResponse> contentStream = s3.initiatePutObject(request);
contentStream.write("My Data".getBytes());
PutObjectResponse response = contentStream.complete();

millems self-assigned this Aug 10, 2017
millems (Contributor) commented Aug 11, 2017

See also: #1139 (comment)

16srana (Author) commented Aug 25, 2017

Sorry for the late reply...
@millems Yes, the pseudocode above is what I expected; issue #1139 has a similar ask as well.
Can we expect this API? If yes, then when?
In my service I have to generate huge zip files, ranging from 2 to 12 GB in size, which I currently have to write to the filesystem first and then upload to S3. I would really love to merge those steps:

SdkOutputStream<PutObjectResponse> contentStream = s3.initiatePutObject(request);
ZipOutputStream os = new ZipOutputStream(contentStream.getOutputStream());
os.putNextEntry(new ZipEntry("my-data.txt")); // a zip entry must be opened before writing
os.write("My Data".getBytes());
os.finish(); // write the zip trailer without closing the underlying stream
PutObjectResponse response = contentStream.complete();

I will be wrapping the contentStream in my zip output stream object.
I am expecting the following benefits:

  1. Less memory use.
  2. Less time until the file is available to my users.
  3. Fewer IO operations.

shorea (Contributor) commented Aug 25, 2017

Would a PipedInputStream and PipedOutputStream work for you? You'd still need to know the content length in this case and set it on the PutObjectRequest, otherwise the SDK will buffer the content into memory.
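
Something like this, roughly (a sketch only; the bucket name, key, and payload are placeholders, not tested code):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipedUploadSketch {
    public static void main(String[] args) throws IOException {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        byte[] content = "My Data".getBytes();

        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        // Writer thread pushes bytes into the pipe; the SDK pulls them
        // back out of the PipedInputStream on the calling thread.
        Thread writer = new Thread(() -> {
            try {
                out.write(content);
                out.close(); // signals end-of-stream to the reader
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        // The content length must be set up front, otherwise the SDK
        // buffers the whole stream into memory to compute it.
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(content.length);

        s3.putObject(new PutObjectRequest("my-bucket", "my-key", in, metadata));
    }
}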

If you don't know the content length, then I think your best option is to do a multipart upload and buffer each part into memory. The minimum part size is 5 MB, so memory consumption is manageable. I don't believe S3 supports chunked encoding, which would allow for dynamic content; if that's desired, you'll have to make a feature request to the S3 service team.
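
A rough sketch of that multipart approach with the v1 client (untested; the bucket, key, and the fill helper are just for illustration, and real code should also abort the upload on failure):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class MultipartUploadSketch {

    // Streams 'source' to S3 in 5 MB parts, buffering one part at a time.
    public static void upload(AmazonS3 s3, String bucket, String key, InputStream source) throws IOException {
        final int partSize = 5 * 1024 * 1024; // S3 minimum part size (except for the last part)
        String uploadId = s3.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(bucket, key)).getUploadId();

        List<PartETag> etags = new ArrayList<>();
        byte[] buffer = new byte[partSize];
        int partNumber = 1;
        int read;
        while ((read = fill(source, buffer)) > 0) {
            UploadPartRequest part = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(uploadId)
                    .withPartNumber(partNumber++)
                    .withInputStream(new ByteArrayInputStream(buffer, 0, read))
                    .withPartSize(read);
            etags.add(s3.uploadPart(part).getPartETag());
        }
        // Real code should call abortMultipartUpload on error to avoid orphaned parts.
        s3.completeMultipartUpload(new CompleteMultipartUploadRequest(bucket, key, uploadId, etags));
    }

    // Reads until the buffer is full or the stream ends; returns the byte count.
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        int n;
        while (total < buf.length && (n = in.read(buf, total, buf.length - total)) > 0) {
            total += n;
        }
        return total;
    }
}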

16srana (Author) commented Aug 28, 2017

@shorea I have tried the piped input/output streams. It involves creating a reader thread and a writer thread, synchronizing them, and messaging between them; that wasn't very effective and had a few performance hits as well. There is also a lot of boilerplate code, so I am already considering this as my backup option.

I was hoping for an option from the SDK itself, but since you have mentioned that the support isn't there in S3 itself, I think I will have to rely on workarounds.
I was reading the S3 upload documentation, and it does mention that if the length of the content is unknown, the upload happens in a single thread, which would take forever for large files.

shorea (Contributor) commented Aug 28, 2017

Yeah, if we don't know the content length we have to buffer the contents into memory, which is obviously not ideal. Doing a multipart upload makes the buffering less of a problem but adds complexity to the upload.

shorea (Contributor) commented Sep 5, 2017

Going to close this and open a feature request in our V2 repo. I think there are a couple of things we can do to make streaming easier via TransferManager.

shorea (Contributor) commented Sep 5, 2017

aws/aws-sdk-java-v2#139

shorea closed this as completed Sep 5, 2017
srchase added the guidance label and removed the Question label Jan 4, 2019