Replies: 2 comments
-
It looks like Alluxio?
-
Velox's local caching feature is very helpful when reading from slow storage (S3, or remote HDFS with HDDs), but it does not support sharing between executors on one compute node. For Spark applications, a typical setup runs multiple executors on one compute node; e.g. on a 96-core server, people may allocate 24 executors with 4 cores each. Currently, if the local SSD cache is enabled in this setup, the same content may be accessed by different executors, so the data is cached multiple times on that compute node and the cache is not used efficiently. A shared local SSD cache can fix this issue. Here's a rough design for this:
CC: @oerling @mbasmanova
thanks, -yuan