Add manual exit at the end of FUSE life cycle #18211

Merged (1 commit) on Sep 26, 2023

Conversation

jiacheliu3 (Contributor) commented on Sep 26, 2023

What changes are proposed in this pull request?

Adds a manual, last-resort System.exit() call at the end of the FUSE lifecycle, to guarantee that the JVM exits and that all threads, daemon and non-daemon, are terminated.

Before this change, the FUSE process could fail to quit after alluxio-fuse umount or a plain kill (not kill -9). One cause is a lingering non-daemon thread:

# This is a non-daemon thread in AlluxioEtcdClient
"vert.x-eventloop-thread-0" #21 prio=5 os_prio=31 cpu=482.23ms elapsed=722.11s tid=0x00007fde79251800 nid=0x8c03 runnable  [0x00007000102ae000]
   java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.KQueue.poll(java.base@11.0.11/Native Method)
   at sun.nio.ch.KQueueSelectorImpl.doSelect(java.base@11.0.11/KQueueSelectorImpl.java:122)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.11/SelectorImpl.java:124)
   - locked <0x00000007c0603938> (a io.netty.channel.nio.SelectedSelectionKeySet)
   - locked <0x00000007c06038d8> (a sun.nio.ch.KQueueSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(java.base@11.0.11/SelectorImpl.java:136)
   at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
   at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:883)
   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:526)
   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
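
As a rough diagnostic sketch (not part of this PR), the following lists the live non-daemon threads in a JVM, which is one way to spot a thread like the one in the dump above without taking a full thread dump. The class name NonDaemonThreads is hypothetical; it only uses Thread.getAllStackTraces() from the standard library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Alluxio code: reports threads that can keep the JVM alive.
public class NonDaemonThreads {
  /** Returns the names of all live non-daemon threads except the calling thread. */
  public static List<String> list() {
    List<String> names = new ArrayList<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
        names.add(t.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Each name printed here belongs to a thread that would prevent a normal JVM exit.
    for (String name : list()) {
      System.out.println("non-daemon thread still alive: " + name);
    }
  }
}
```

Calling list() just before the end of the lifecycle would show whether something like vert.x-eventloop-thread-0 is still running.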

JVM shutdown hooks run when the last non-daemon thread exits or when System.exit() is called. If a library we use, or one of our own thread pools, accidentally creates non-daemon threads, the first condition is not met while those threads are alive, so System.exit() is our last resort for triggering the shutdown hooks.

FUSE does handle signals and exit on them, but some corner cases do not seem to trigger that path successfully. This PR adds the explicit exit as a fallback for those cases.
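
The idea can be illustrated with a minimal sketch. This is not the code merged in this PR: the class FuseExitSketch, the method runThenExit, and the stand-in lifecycle are all hypothetical. It assumes the FUSE lifecycle is a blocking call that returns on unmount, and it takes the exit function as a parameter so the behavior can be exercised without killing the JVM.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch, not Alluxio code: always call exit once the lifecycle ends.
public class FuseExitSketch {
  /** Runs the blocking lifecycle, then calls exit with 0 on success or 1 on failure. */
  public static void runThenExit(Runnable fuseLifecycle, IntConsumer exit) {
    int status = 0;
    try {
      fuseLifecycle.run(); // assumed to block until unmount or a handled signal
    } catch (Throwable t) {
      status = 1;
    } finally {
      // Last resort: exit even if non-daemon threads are still running.
      exit.accept(status);
    }
  }

  public static void main(String[] args) {
    Runtime.getRuntime().addShutdownHook(
        new Thread(() -> System.out.println("shutdown hook ran")));

    // Stand-in for a library-owned non-daemon thread, such as an event loop.
    Thread lingering = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE);
      } catch (InterruptedException e) {
        // ignored in this sketch
      }
    }, "lingering-non-daemon");
    lingering.start();

    // Without the System.exit call, this JVM would hang on the lingering thread.
    runThenExit(() -> System.out.println("FUSE lifecycle finished"), System::exit);
  }
}
```

Passing System::exit in main gives the real behavior: the shutdown hook runs and the process ends despite the lingering thread. Injecting the exit function is only a convenience of this sketch.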

Why are the changes needed?

See above

Does this PR introduce any user facing changes?

No

jiacheliu3 added the type-code-quality (code quality improvement), type-bug (This issue is about a bug), and area-fuse (Alluxio fuse integration) labels on Sep 26, 2023
jiacheliu3 (Contributor, Author) commented:

alluxio-bot, merge this please

alluxio-bot merged commit 926f393 into Alluxio:main on Sep 26, 2023
12 checks passed
3 participants