-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-4307] Initialize FileDescriptor lazily in FileRegion. #3172
Conversation
@@ -22,6 +22,9 @@ import java.nio.ByteBuffer | |||
import java.util.concurrent.ConcurrentLinkedQueue | |||
import java.util.concurrent.atomic.AtomicInteger | |||
|
|||
import org.apache.spark.network.netty.SparkTransportConf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: import order
Test build #23111 has started for PR 3172 at commit
|
this.conf = conf; | ||
} | ||
|
||
@VisibleForTesting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably don't need 2 VisibleForTesting constructors, do we? If someone wants to provide an explicit Executor, they can damn well create their own TransportConf. This is only done in a single unit test anyway.
LGTM, just a few minor style nits. |
Test build #23111 has finished for PR 3172 at commit
|
Test FAILed. |
Test build #23143 has started for PR 3172 at commit
|
Test build #23143 has finished for PR 3172 at commit
|
Test FAILed. |
Test build #23147 has started for PR 3172 at commit
|
Test build #23147 has finished for PR 3172 at commit
|
Test FAILed. |
Test build #515 has started for PR 3172 at commit
|
Test build #515 has finished for PR 3172 at commit
|
@aarondav comments addressed. |
Cool, everything LGTM. Would you mind adding a unit test for LazyFileRegion, though, in case people try to modify it in the future? |
Do you think we can add this directly to Netty? Doing this at Spark level means Epoll won't support this feature. More specifically, I was referring to LazyFileRegion.java. |
Test build #23190 has started for PR 3172 at commit
|
Test build #23190 has finished for PR 3172 at commit
|
Test PASSed. |
Merging into master and branch-1.2.0. |
Netty's DefaultFileRegion requires a FileDescriptor in its constructor, which means we need to have a opened file handle. In super large workloads, this could lead to too many open files due to the way these file descriptors are cleaned. This pull request creates a new LazyFileRegion that initializes the FileDescriptor when we are sending data for the first time. Author: Reynold Xin <rxin@databricks.com> Author: Reynold Xin <rxin@apache.org> Closes #3172 from rxin/lazyFD and squashes the following commits: 0bdcdc6 [Reynold Xin] Added reference to Netty's DefaultFileRegion d4564ae [Reynold Xin] Added SparkConf to the ctor argument of IndexShuffleBlockManager. 6ed369e [Reynold Xin] Code review feedback. 04cddc8 [Reynold Xin] [SPARK-4307] Initialize FileDescriptor lazily in FileRegion. (cherry picked from commit ef29a9a) Signed-off-by: Aaron Davidson <aaron@databricks.com>
@rxin, @normanmaurer told me via IM that he has a patch pending. Could you file an issue and CC him? |
Done - netty/netty#3129 |
Netty's DefaultFileRegion requires a FileDescriptor in its constructor, which means we need to have a opened file handle. In super large workloads, this could lead to too many open files due to the way these file descriptors are cleaned. This pull request creates a new LazyFileRegion that initializes the FileDescriptor when we are sending data for the first time.