forked from apache/spark
SPARK-1676 Cache Hadoop UGIs by default to prevent FileSystem leak
UserGroupInformation objects (UGIs) are used for Hadoop security. A relatively recent PR (apache#29) makes Spark always use UGIs when executing tasks. Unfortunately, this triggers HDFS-3545: the FileSystem cache continuously creates new FileSystems because the UGIs look different, even though they are logically identical. The result is a memory leak, and sometimes a file descriptor leak, for FileSystems (like S3N) that maintain open connections.

The solution is to introduce a config option (enabled by default) that reuses a single Spark user UGI rather than creating a new one for each task.

The downside to this approach is that UGIs cannot be safely cached (see the notes in HDFS-3545). For example, if a token expires, it is never cleared from the UGI but may be used anyway (which token on a UGI gets used is nondeterministic, as the tokens are backed by a Set).

This setting is enabled by default because the memory leak can become serious very quickly. In one benchmark, attempting to read 10k files from an S3 directory left 45k connections to S3 open after the job completed. These file descriptors are never cleaned up, nor is the memory used by the associated FileSystems.

Conflicts:
docs/configuration.md
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
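The leak described in the commit message can be illustrated with a small toy model. This is only a sketch, not Hadoop or Spark code: the classes `Ugi` and `FileSystemCache`, the function `run_tasks`, and the `cache_ugi` flag are all hypothetical names invented for illustration. The model assumes that the FileSystem cache key includes the UGI and that two UGIs for the same user do not compare equal, which is the behavior the message attributes to HDFS-3545.

```python
class Ugi:
    """Toy stand-in for a Hadoop UGI (hypothetical model, not the real class).

    Equality and hashing are left as object identity, so two Ugi objects
    for the same user are treated as different cache keys.
    """

    def __init__(self, user):
        self.user = user


class FileSystemCache:
    """Toy model of a FileSystem cache keyed by (URI, UGI)."""

    def __init__(self):
        self._cache = {}

    def get(self, uri, ugi):
        key = (uri, ugi)  # the UGI part of the key compares by identity
        if key not in self._cache:
            # Stands in for opening a new FileSystem (connections, memory).
            self._cache[key] = object()
        return self._cache[key]

    def size(self):
        return len(self._cache)


def run_tasks(num_tasks, cache_ugi):
    """Run toy tasks and return how many FileSystems ended up cached."""
    fs_cache = FileSystemCache()
    shared_ugi = Ugi("spark")
    for _ in range(num_tasks):
        # With cache_ugi=False every task builds a fresh UGI, as in the leak.
        ugi = shared_ugi if cache_ugi else Ugi("spark")
        fs_cache.get("s3n://bucket/", ugi)
    return fs_cache.size()


if __name__ == "__main__":
    print("fresh UGI per task:", run_tasks(1000, cache_ugi=False))
    print("single reused UGI: ", run_tasks(1000, cache_ugi=True))
```

In this model, a fresh UGI per task leaves one cached FileSystem per task, while reusing a single UGI leaves exactly one. The model does not capture the trade-off noted above, namely that a reused UGI can keep holding an expired token.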
Showing 5 changed files with 43 additions and 14 deletions.