We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
高并发请求时, 偶发复现
间隔指定时间, 比如 10 分钟, 再进行 threaddump
下面是服务器上的日志记录, 可以看到 dubbo 线程池耗尽时的 threaddump 并没有按照预期的间隔 10分钟进行
-rw-rw-r-- 1 root root 695929 Sep 1 19:30 Dubbo_JStack.log.2023-09-01_19:30:39 -rw-rw-r-- 1 root root 1068779 Sep 3 09:00 Dubbo_JStack.log.2023-09-03_09:00:35 -rw-rw-r-- 1 root root 2326331 Sep 3 23:53 Dubbo_JStack.log.2023-09-03_23:53:21 -rw-rw-r-- 1 root root 1081798 Sep 5 09:00 Dubbo_JStack.log.2023-09-05_09:00:35 -rw-rw-r-- 1 root root 1652182 Sep 5 09:00 Dubbo_JStack.log.2023-09-05_09:00:38 -rw-rw-r-- 1 root root 956693 Sep 9 19:30 Dubbo_JStack.log.2023-09-09_19:30:34 -rw-rw-r-- 1 root root 1515179 Sep 9 19:30 Dubbo_JStack.log.2023-09-09_19:30:37 -rw-rw-r-- 1 root root 981556 Sep 10 09:00 Dubbo_JStack.log.2023-09-10_09:00:38 -rw-rw-r-- 1 root root 1443054 Sep 10 09:00 Dubbo_JStack.log.2023-09-10_09:00:42 -rw-rw-r-- 1 root root 999105 Sep 11 09:00 Dubbo_JStack.log.2023-09-11_09:00:38 -rw-rw-r-- 1 root root 1500287 Sep 11 09:00 Dubbo_JStack.log.2023-09-11_09:00:43
按照以下顺序获取锁就可能导致前面的现象, 线程切换 + 错误的顺序设置 lastPrintTime 共同导致了这个问题
thread1: 1. 检查10分钟间隔通过 3. 获取锁 4. threaddump(stw) 5. 释放锁 7. 设置 lastPrintTime thread2: 2. 检查10分钟间隔通过 6. 获取锁 8. threaddump(stw) 9. 释放锁
源码如下 https://github.com/giraffe-tree/dubbo/blob/3.2/dubbo-common/src/main/java/org/apache/dubbo/common/threadpool/support/AbortPolicyWithReport.java
//dump every 10 minutes if (now - lastPrintTime < TEN_MINUTES_MILLS) { return; } if (!guard.tryAcquire()) { return; } ExecutorService pool = Executors.newSingleThreadExecutor(); pool.execute(() -> { // ....代码省略 try (FileOutputStream jStackStream = new FileOutputStream( new File(dumpPath, "Dubbo_JStack.log" + "." + dateStr))) { jstack(jStackStream); } catch (Exception t) { logger.error(COMMON_UNEXPECTED_CREATE_DUMP, "", "", "dump jStack error", t); } finally { guard.release(); } lastPrintTime = System.currentTimeMillis(); }); //must shutdown thread pool ,if not will lead to OOM pool.shutdown();
if (now - lastPrintTime < TEN_MINUTES_MILLS) { return; } if (!guard.tryAcquire()) { return; } // 1. 此处再次判断时间 if (System.currentTimeMillis() - lastPrintTime < TEN_MINUTES_MILLS) { return; } ExecutorService pool = Executors.newSingleThreadExecutor(); pool.execute(() -> { // ....代码省略 try (FileOutputStream jStackStream = new FileOutputStream( new File(dumpPath, "Dubbo_JStack.log" + "." + dateStr))) { jstack(jStackStream); } catch (Exception t) { logger.error(COMMON_UNEXPECTED_CREATE_DUMP, "", "", "dump jStack error", t); } finally { // 2. 在释放锁前设置 lastPrintTime lastPrintTime = System.currentTimeMillis(); guard.release(); } }); //must shutdown thread pool ,if not will lead to OOM pool.shutdown();
The text was updated successfully, but these errors were encountered:
fix: AbortPolicyWithReport concurrency issues apache#13042
982eb4c
fix: AbortPolicyWithReport concurrency issues #13042 (#13043)
d9db031
Successfully merging a pull request may close this issue.
Environment
Steps to reproduce this issue
高并发请求时, 偶发复现
Expected Behavior
间隔指定时间, 比如 10 分钟, 再进行 threaddump
Actual Behavior
下面是服务器上的日志记录, 可以看到 dubbo 线程池耗尽时的 threaddump 并没有按照预期的间隔 10分钟进行
原因分析
按照以下顺序获取锁就可能导致前面的现象, 线程切换 + 错误的顺序设置 lastPrintTime 共同导致了这个问题
源码如下 https://github.com/giraffe-tree/dubbo/blob/3.2/dubbo-common/src/main/java/org/apache/dubbo/common/threadpool/support/AbortPolicyWithReport.java
修复方案
The text was updated successfully, but these errors were encountered: