Leader node is lost and a new leader cannot be re-elected automatically #111
Comments
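Please identify the cause and provide a dump file. |
The triggering condition: a single instance lost its connection to ZooKeeper, and the network connection recovered after a while. It is probably caused by the RECONNECT event not being received. I will reproduce the problem later and then provide the dump file. |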
We ran into this problem in production as well. In elastic-job-console-1.0.2 the job looks normal, but this shard has in fact never run. The dump log is as follows:
2016-07-12 14:05:36
"Attach Listener" #12237 daemon prio=9 os_prio=0 tid=0x00007f5d6c001000 nid=0x2135 waiting on condition [0x0000000000000000]
"Curator-LeaderLatch-0" #12236 daemon prio=5 os_prio=0 tid=0x00007f5d3400b800 nid=0x41a7 in Object.wait() [0x00007f5d60b80000]
"Curator-LeaderLatch-0" #12235 daemon prio=5 os_prio=0 tid=0x00007f5d2c2d5800 nid=0x4129 in Object.wait() [0x00007f5d60a7f000]
"ThreadPoolTaskExecutor-10" #40 prio=5 os_prio=0 tid=0x00007f5d2c312000 nid=0x6df1 waiting on condition [0x00007f5d60c81000]
"ThreadPoolTaskExecutor-9" #38 prio=5 os_prio=0 tid=0x00007f5d2c21c000 nid=0x6a43 waiting on condition [0x00007f5d60d82000]
"ThreadPoolTaskExecutor-8" #37 prio=5 os_prio=0 tid=0x00007f5d2c29f000 nid=0x68ae waiting on condition [0x00007f5d60e83000]
"ThreadPoolTaskExecutor-7" #35 prio=5 os_prio=0 tid=0x00007f5d2c29e000 nid=0x6861 waiting on condition [0x00007f5d60f84000]
"ThreadPoolTaskExecutor-6" #34 prio=5 os_prio=0 tid=0x00007f5d2c01a000 nid=0x67eb waiting on condition [0x00007f5d61085000]
"ThreadPoolTaskExecutor-5" #33 prio=5 os_prio=0 tid=0x00007f5d2c2a1800 nid=0x5fa4 waiting on condition [0x00007f5d61186000]
"ThreadPoolTaskExecutor-4" #32 prio=5 os_prio=0 tid=0x00007f5d2c2a0800 nid=0x5f3f waiting on condition [0x00007f5d61287000]
"ThreadPoolTaskExecutor-3" #29 prio=5 os_prio=0 tid=0x00007f5d2c15c800 nid=0x59de waiting on condition [0x00007f5d61388000]
"ThreadPoolTaskExecutor-2" #27 prio=5 os_prio=0 tid=0x00007f5d2c309000 nid=0x58d6 waiting on condition [0x00007f5d6178a000]
"commons-pool-EvictionTimer" #26 daemon prio=5 os_prio=0 tid=0x00007f5d20138800 nid=0x4deb in Object.wait() [0x00007f5d61689000]
"ThreadPoolTaskExecutor-1" #24 prio=5 os_prio=0 tid=0x00007f5d2c274800 nid=0x4de6 waiting on condition [0x00007f5d6216d000]
"Timer-1" #22 daemon prio=5 os_prio=0 tid=0x00007f5d2c04a000 nid=0x4d39 in Object.wait() [0x00007f5d7c313000]
"Abandoned connection cleanup thread" #21 daemon prio=5 os_prio=0 tid=0x00007f5d2c05c800 nid=0x4d38 in Object.wait() [0x00007f5d7c414000]
"DestroyJavaVM" #20 prio=5 os_prio=0 tid=0x00007f5d9c009000 nid=0x4d1b waiting on condition [0x0000000000000000]
"Timer-0" #19 daemon prio=5 os_prio=0 tid=0x00007f5d9c7ad000 nid=0x4d33 in Object.wait() [0x00007f5d7c715000]
"DEFAULT.processFeedDataJob_Scheduler_QuartzSchedulerThread" #18 prio=5 os_prio=0 tid=0x00007f5d9c792800 nid=0x4d32 in Object.wait() [0x00007f5d7c816000]
"DEFAULT.processFeedDataJob_Scheduler_Worker-1" #17 prio=5 os_prio=0 tid=0x00007f5d9c783800 nid=0x4d31 in Object.wait() [0x00007f5d7c917000]
"pool-7-thread-1" #16 prio=5 os_prio=0 tid=0x00007f5d9c76c800 nid=0x4d30 waiting on condition [0x00007f5d7ca1d000]
"Curator-TreeCache-0" #14 daemon prio=5 os_prio=0 tid=0x00007f5d3c006800 nid=0x4d2e in Object.wait() [0x00007f5d7cd1e000]
"Curator-Framework-0" #13 daemon prio=5 os_prio=0 tid=0x00007f5d9c6f7000 nid=0x4d2d waiting on condition [0x00007f5d7ce1f000]
"main-EventThread" #12 daemon prio=5 os_prio=0 tid=0x00007f5d9c737000 nid=0x4d2c waiting on condition [0x00007f5d7d120000]
"main-SendThread(10.144.156.103:2181)" #11 daemon prio=5 os_prio=0 tid=0x00007f5d9c736000 nid=0x4d2b runnable [0x00007f5d7d221000]
"Curator-ConnectionStateManager-0" #10 daemon prio=5 os_prio=0 tid=0x00007f5d9c6df000 nid=0x4d2a waiting on condition [0x00007f5d7d533000]
"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f5d9c0c1000 nid=0x4d28 runnable [0x0000000000000000]
"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f5d9c0bb800 nid=0x4d27 waiting on condition [0x0000000000000000]
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f5d9c0ba000 nid=0x4d26 waiting on condition [0x0000000000000000]
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f5d9c0b7000 nid=0x4d25 waiting on condition [0x0000000000000000]
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f5d9c0b5800 nid=0x4d24 runnable [0x0000000000000000]
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f5d9c07e000 nid=0x4d22 in Object.wait() [0x00007f5d7e648000]
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f5d9c07b800 nid=0x4d21 in Object.wait() [0x00007f5d7e749000]
"VM Thread" os_prio=0 tid=0x00007f5d9c076800 nid=0x4d20 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f5d9c01e000 nid=0x4d1c runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f5d9c020000 nid=0x4d1d runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f5d9c021800 nid=0x4d1e runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f5d9c023800 nid=0x4d1f runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f5d9c0c3800 nid=0x4d29 waiting on condition
JNI global references: 292 |
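Please provide the esjob dump and state the version number. |
The version is elastic-job-core-1.0.2, with curator-framework-2.8.0.jar and zookeeper-3.4.6.jar. |
Could it be that the job's next execution time has not arrived yet? As I recall, sharding is triggered when the job executes. |
The job fires every 10 seconds, so the execution time has definitely arrived. Right now the job is hung and not moving. |
Please upgrade to a version after 1.0.6; versions before 1.0.2 did have this problem. |
OK, thanks. |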
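When a scheduled job starts, the instance that is the leader node creates the leader/election/host path under the corresponding server directory in zk. For some unknown reason this host node was lost, and the framework could not elect a new leader automatically, so the job hangs and keeps waiting for the election.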
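To make the failure mode concrete, here is a minimal sketch of the recovery step the description implies: when the connection comes back, check whether the leader/election/host node still exists and re-run the election if it does not. This is not Elastic-Job's or Curator's code. `FakeRegistry`, `ElectionListener`, and the `ConnectionState` enum are hypothetical names, and an in-memory map stands in for ZooKeeper so the example runs on its own.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class LeaderReelectionSketch {

    // Hypothetical connection states; an assumption for this sketch only.
    enum ConnectionState { CONNECTED, SUSPENDED, LOST, RECONNECTED }

    // In-memory stand-in for the registry center; NOT a real ZooKeeper client.
    static class FakeRegistry {
        private final Map<String, String> nodes = new ConcurrentHashMap<>();

        // Creates the node only if it is absent; returns true if this call created it.
        boolean createIfAbsent(String path, String value) {
            return nodes.putIfAbsent(path, value) == null;
        }

        void delete(String path) {
            nodes.remove(path);
        }

        Optional<String> get(String path) {
            return Optional.ofNullable(nodes.get(path));
        }
    }

    // Hypothetical listener that re-runs the election after a reconnect.
    static class ElectionListener {
        // Path taken from the issue description, relative to the job's directory.
        static final String LEADER_HOST_PATH = "leader/election/host";

        private final FakeRegistry registry;
        private final String localHost;

        ElectionListener(FakeRegistry registry, String localHost) {
            this.registry = registry;
            this.localHost = localHost;
        }

        // Only a reconnect triggers the check; other states are ignored here.
        void stateChanged(ConnectionState newState) {
            if (newState == ConnectionState.RECONNECTED) {
                electIfLeaderMissing();
            }
        }

        // Claims leadership if no host node exists; returns true if this instance won.
        boolean electIfLeaderMissing() {
            return registry.createIfAbsent(LEADER_HOST_PATH, localHost);
        }
    }

    public static void main(String[] args) {
        FakeRegistry registry = new FakeRegistry();
        ElectionListener listener = new ElectionListener(registry, "host-a");

        listener.electIfLeaderMissing();                    // initial election
        registry.delete(ElectionListener.LEADER_HOST_PATH); // simulate the host node being lost
        listener.stateChanged(ConnectionState.LOST);
        System.out.println(registry.get(ElectionListener.LEADER_HOST_PATH).isPresent()); // false

        listener.stateChanged(ConnectionState.RECONNECTED);
        System.out.println(registry.get(ElectionListener.LEADER_HOST_PATH).orElse("none")); // host-a
    }
}
```

If the reconnect notification never reaches the listener, as suspected in the comments, the host node stays missing and nothing triggers a new election, which matches the hang described in this issue.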