Have to choose between deadlocks, unbounded executors, or internal APIs #939
Comments
What features are you using in TransferManager? We've been actively trying to resolve any deadlock issues that come up whilst using TransferManager (the latest one was #896). In the next major version of the SDK we'll be looking to make use of Java 7's ForkJoinPool to better manage tasks/sub-tasks.
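Not SDK code, just a minimal illustration of the property that comment alludes to: in a ForkJoinPool, a parent task that joins its children helps execute pending work instead of parking, so parent/child dependencies don't exhaust a small pool the way they can in a plain ThreadPoolExecutor. The class and the numbers below are invented for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative only: a parent task that splits into subtasks and joins them.
// join() lets the worker thread run pending subtasks itself, so a small pool
// cannot deadlock on parent/child dependencies.
public class ForkJoinSketch {
    static class SumParts extends RecursiveTask<Long> {
        private final long from, to;
        SumParts(long from, long to) { this.from = from; this.to = to; }

        @Override
        protected Long compute() {
            if (to - from <= 1_000) {
                long sum = 0;
                for (long i = from; i < to; i++) sum += i;
                return sum;
            }
            long mid = (from + to) / 2;
            SumParts left = new SumParts(from, mid);
            SumParts right = new SumParts(mid, to);
            left.fork();                           // subtask queued for work-stealing
            return right.compute() + left.join();  // join() helps run pending work
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(2);   // deliberately small parallelism
        System.out.println(pool.invoke(new SumParts(0, 1_000_000)));
        pool.shutdown();
    }
}
```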
All the code that uses TransferManager can be seen here: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java. We're seeing deadlocks when S3AFileSystem.rename(), which ultimately calls TransferManager.copy(), is called from multiple threads.
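For anyone trying to reproduce this outside Hadoop, here is a self-contained sketch of the failure mode (not the S3A code; the pool and queue sizes are arbitrary): a "control" task, like the copy started by rename(), fans out subtasks to the same bounded pool and blocks on them, and once every worker holds a blocked control task the queued subtasks can never run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal illustration of the deadlock: control tasks occupy all worker
// threads and wait on subtasks that sit in the bounded queue behind them.
public class BoundedPoolDeadlock {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(8));   // bounded queue shared by everything

        List<Future<?>> controls = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            controls.add(pool.submit(() -> {
                // Each control task fans out subtasks into the SAME pool...
                List<Future<?>> parts = new ArrayList<>();
                for (int p = 0; p < 2; p++) {
                    parts.add(pool.submit(() -> { /* copy one part */ }));
                }
                // ...and blocks until they finish. With both workers stuck
                // here, the queued subtasks never start: deadlock.
                for (Future<?> part : parts) {
                    part.get();
                }
                return null;
            }));
        }
        for (Future<?> c : controls) {
            c.get();   // hangs forever with the configuration above
        }
        pool.shutdown();
    }
}
```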
Does using a bounded queue with a CallerRunsPolicy rejection handler avoid the deadlock?
Using CallerRunsPolicy fixed this specific issue as long as it was used directly and not nested in the BlockingThreadPoolExecutorService, but that would leave us with the original problem that the back-pressure was meant to fix. For now, we've gone with a separate unbounded thread pool just for the TransferManager, with the previous design still in use for everything else.
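To make the two workarounds concrete, this is roughly what they look like (my sketch, not the actual S3A patch; the client construction, pool sizes, and queue capacity are assumptions):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;

public class TransferManagerPools {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Workaround in use: give TransferManager its own unbounded pool so its
        // control tasks and part-copy subtasks can never starve each other,
        // and keep the bounded, back-pressured pool for everything else.
        TransferManager tm =
                new TransferManager(s3, Executors.newCachedThreadPool());

        // Alternative: a bounded pool whose rejection handler is
        // CallerRunsPolicy, so overflow work runs on the submitting thread
        // instead of queueing behind tasks that may depend on it. This only
        // helped here when used directly, not when wrapped in the
        // semaphore-based BlockingThreadPoolExecutorService.
        ThreadPoolExecutor bounded = new ThreadPoolExecutor(
                4, 16, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(32),
                new ThreadPoolExecutor.CallerRunsPolicy());

        bounded.shutdown();
        tm.shutdownNow();   // tear down when finished
    }
}
```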
@mackrorysd I don't see much we can do in the current version of the SDK. I'm going to close this for now because it seems that you found a way forward that allowed you to get unblocked. We'd love to get feedback on v2 of the SDK, including ideas for how to make this easier.
TransferManager's documentation warns against using a single-threaded executor or a thread pool with a bounded work queue, because control tasks may submit subtasks that can't complete until all subtasks complete, so an incorrectly configured thread pool can deadlock.
An unbounded executor is going to be a tough sell in some cases. The alternative I see is an executor that has distinct resource pools for tasks that are dependent on others. However, that's not currently possible without depending on the contents of the transfer.internal.* package, which is not ideal either. I'd like to request that a better mechanism be made available for having strict control over resource limits without risking deadlock. For some context on where I'm coming from, see https://issues.apache.org/jira/browse/HADOOP-13826.
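To make the "distinct resource pools" idea concrete, here is a rough, generic sketch (not tied to TransferManager, since its internals aren't accessible; all names and sizes are invented): control tasks and the subtasks they wait on run in separate bounded pools, so a saturated control pool can never starve the subtasks it depends on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of a two-tier executor: dependent task kinds get their own pools,
// keeping resource limits strict without risking deadlock.
public class TieredExecutor {
    private final ExecutorService controlPool;
    private final ExecutorService subtaskPool;

    TieredExecutor(int controlThreads, int subtaskThreads, int queueCapacity) {
        // CallerRunsPolicy keeps both pools bounded while providing
        // back-pressure instead of rejection when a queue fills up.
        this.controlPool = new ThreadPoolExecutor(
                controlThreads, controlThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
        this.subtaskPool = new ThreadPoolExecutor(
                subtaskThreads, subtaskThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // A control task fans out to the subtask pool and blocks on the results;
    // because the pools are separate, the subtasks always get threads
    // eventually, so the wait terminates.
    <T> Future<List<T>> submitControl(List<Callable<T>> subtasks) {
        return controlPool.submit(() -> {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> task : subtasks) {
                futures.add(subtaskPool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                results.add(f.get());
            }
            return results;
        });
    }

    void shutdown() {
        controlPool.shutdown();
        subtaskPool.shutdown();
    }
}
```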