Transfer Manager Issues #1288

Closed
rdifalco opened this issue Sep 2, 2017 · 9 comments
Labels
investigating This issue is being investigated and/or work is in progress to resolve the issue.

Comments


rdifalco commented Sep 2, 2017

I'm using the latest SDK as of this issue creation.

I'm trying to download a 7GB file and I keep getting this exception. I have an 8 minute socket timeout, 5 minute read timeout. And I have many retries configured. My code is simple:

Halp!

            Download download =
                s3TransferManager.download(getBucketName(), srcKey, file, Duration.ofHours(2).toMillis());

            try {
                download.waitForCompletion();

            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
com.amazonaws.SdkClientException: Unable to store object contents to disk: Connection reset
  at com.amazonaws.services.s3.internal.ServiceUtils.downloadToFile(ServiceUtils.java:313) 
  at com.amazonaws.services.s3.internal.ServiceUtils.downloadObjectToFile(ServiceUtils.java:270) 
  at com.amazonaws.services.s3.internal.ServiceUtils.retryableDownloadS3ObjectToFile(ServiceUtils.java:402) 
  at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1455) 
  at com.amazonaws.services.s3.transfer.internal.DownloadPartCallable.call(DownloadPartCallable.java:59) 
  at com.amazonaws.services.s3.transfer.internal.DownloadPartCallable.call(DownloadPartCallable.java:31) 
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_101]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_101]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_101]
  at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Caused by: java.net.SocketException: Connection reset
  at java.net.SocketInputStream.read(SocketInputStream.java:209) ~[?:1.8.0_101]
  at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_101]
  at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) ~[?:1.8.0_101]
  at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593) ~[?:1.8.0_101]
  at sun.security.ssl.InputRecord.read(InputRecord.java:532) ~[?:1.8.0_101]
  at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) ~[?:1.8.0_101]
  at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) ~[?:1.8.0_101]
  at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[?:1.8.0_101]
  at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) 
  at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:198) 
  at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176) 
  at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:107) 
  at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82) 
  at java.io.FilterInputStream.read(FilterInputStream.java:107) ~[?:1.8.0_101]
  at com.amazonaws.services.s3.internal.ServiceUtils.downloadToFile(ServiceUtils.java:307) 
  ... 9 more
Author

rdifalco commented Sep 2, 2017

I am using the VPC endpoint FWIW.

Author

rdifalco commented Sep 2, 2017

It would also be nice if you cleaned up all the temporary part files. :)

dagnir added the investigating label Sep 5, 2017
Contributor

dagnir commented Sep 5, 2017

Hi @rdifalco. Unfortunately, tweaking retries at the client level would not help in this case: the IOException happens outside of the HTTP client request execution, so the retry configuration does not apply. See #485 for a relevant discussion on this.

retryableDownloadS3ObjectToFile does do an automatic retry when it encounters a SocketException, but it only retries once, and the number of retries is not configurable. This is not exactly the best solution, and we'll be addressing this issue directly in v2: aws/aws-sdk-java-v2#139
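
Since the built-in retry is limited to one attempt, a caller-side retry loop is one possible workaround. The following is only a minimal sketch, not part of the SDK: the class name SocketRetry and the method callWithRetries are hypothetical, and it assumes the SocketException is still reachable through the cause chain of whatever exception the download throws (as in the stack trace above). Each retry restarts the wrapped task from scratch; it does not resume a partial download.

```java
import java.net.SocketException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of the AWS SDK.
final class SocketRetry {

    private SocketRetry() {}

    /** Returns true if t, or any exception in its cause chain, is a SocketException. */
    static boolean causedBySocketException(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the task up to maxAttempts times. Only failures caused by a
     * SocketException are retried; anything else is rethrown immediately.
     * The pause between attempts grows linearly (backoffMillis * attempt).
     */
    static <T> T callWithRetries(Callable<T> task, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !causedBySocketException(e)) {
                    throw e;
                }
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }
}
```

Under those assumptions, the download from the original post could be wrapped roughly as `SocketRetry.callWithRetries(() -> { s3TransferManager.download(bucket, key, file).waitForCompletion(); return null; }, 5, 1000L)`, where the attempt count and backoff are arbitrary illustrative values.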

Are you seeing this issue every time you attempt to download the object?

Author

rdifalco commented Sep 6, 2017

I'm not sure why, but yeah, it happens every time I do this download on EC2. When I use my local dev machine I can make it work (although it almost always gets the one retry).

FWIW, if I simply stream the file to disk, it works every time, even on EC2. And if I use a giant direct buffer with a file channel, it's almost as fast as the multi-part download, with no socket resets.
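
The streaming workaround described above was not posted in the thread; the following is only a rough sketch of what such a copy loop might look like, using a direct ByteBuffer and a FileChannel. The class name StreamToFile and its copy method are hypothetical. In real use the InputStream would presumably come from something like `s3.getObject(bucket, key).getObjectContent()`, and the buffer size is a tuning choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not the code used by the original poster.
final class StreamToFile {

    private StreamToFile() {}

    /**
     * Copies everything from in to target through a direct buffer of
     * bufferBytes bytes and returns the number of bytes written.
     * The input stream is closed when the copy finishes or fails.
     */
    static long copy(InputStream in, Path target, int bufferBytes) throws IOException {
        if (bufferBytes < 1) {
            throw new IllegalArgumentException("bufferBytes must be >= 1");
        }
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferBytes);
        long total = 0;
        try (ReadableByteChannel source = Channels.newChannel(in);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            while (source.read(buffer) != -1) {
                buffer.flip();                 // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    total += out.write(buffer);
                }
                buffer.clear();                // make the buffer ready for the next read
            }
        }
        return total;
    }
}
```

This is a single-connection, single-threaded copy, so it sidesteps the multi-part machinery in TransferManager entirely; whether it matches multi-part throughput depends on the environment.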

So I dunno, maybe I have too many threads and parts and that is causing the SDK connection pool to get exhausted in some way that creates the connection reset? That's my best guess at the moment.

How long before the S3 server times out a client connection and causes a connection reset on the client side?

Contributor

dagnir commented Sep 6, 2017

Hmm, it seems strange that this works on your dev machine. Have you perhaps tried this on an EC2 instance not in a VPC to see if that changes anything?

According to #373, this page used to say that S3 would accept a maximum of 100 requests on a connection before resetting the connection, but it no longer says that, so it may have changed.

Can you give me more details on your environment? SDK version, configuration, OS, etc?

Author

rdifalco commented Sep 6, 2017

It could be that the S3 client connection pool in my local test was less contended than the one on my EC2 server. I haven't tried putting a simple case onto S3 yet. I assumed that the S3 servers would kill a connection (causing a client reset) if the connection was open for some period of time without any activity. Hence my concern about TransferManager getting into a kind of deadlock because it needs to lease another connection from the connection pool while the pool is exhausted. This causes other threads in TransferManager to wait too long, because maybe it's a control thread that requires that connection to mark some monitor that will let the blocked thread proceed. So, IOW, running out of client connections creates the deadlock (or long pause), not the usual use of a bounded thread pool with TransferManager. I'll try to add more data later, but for now I have this working by avoiding TransferManager and writing the file myself.

rdifalco closed this as completed Sep 6, 2017
rdifalco reopened this Sep 6, 2017
Contributor

dagnir commented Sep 6, 2017

Okay, glad to see you at least have a workaround going.

FWIW, it's probably not due to connection pool contention/stale connections since the connection reset from the provided stacktrace is happening once S3 has already responded to the request.

Contributor

dagnir commented Dec 1, 2017

Going to close this for now. Please feel free to reopen when there's more data.

dagnir closed this as completed Dec 1, 2017

NuonDara commented Jul 28, 2018

Hi guys,
I am trying to upload a file to S3, but it shows me the error "Unable to store object contents to disk".
The file size is 50 MB.
What is wrong?
Thanks.
