[Wisp] Coroutine hangs permanently in AbstractQueuedSynchronizer #236

Closed
joeyleeeeeee97 opened this issue Jul 6, 2021 · 0 comments
Description
In rare cases a Wisp coroutine parks inside AbstractQueuedSynchronizer and never resumes, so it hangs permanently.

Steps to Reproduce

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class ScheduleTest {

    public static void main(String[] args) throws InterruptedException {

        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(16);

        for (int i = 0; i < 16; i++) {
            int finalI = i;
            executor.scheduleAtFixedRate(new Runnable() {
                int count = 0;
                @Override
                public void run() {
                    try {
                        Thread.sleep(5);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    System.out.println("thread:" + Thread.currentThread().getName() + ", schedule " + finalI + ": " + count++);
                }
            }, 100, ThreadLocalRandom.current().nextInt(5) + 10, TimeUnit.MILLISECONDS);
        }

        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                while(true) {
                    try {
                        final int[] count = {0};
                        while (true) {
                            count[0]++;
                            executor.submit(new Runnable() {
                                @Override
                                public void run() {
                                    System.out.println("submit: " + count[0]);
                                }
                            });

                            Thread.sleep(1);
                        }
                    } catch (Throwable e) {
                        // Deliberately swallowed: an interrupt (or any other failure) just restarts the submit loop.
                    }
                }
            }
        });

        thread.start();

        // Interrupt the submitter thread every 10 ms, forever; the program never exits on its own.
        while (true) {
            thread.interrupt();

            Thread.sleep(10);
        }

    }
}

Soon only a few threads remain active; most are parked in an illegal state, as in this coroutine dump:

- Coroutine [0x7f92705bd790] "hread-2" #172 active=61697 steal=11137 steal_fail=30 preempt=0 park=0/-1 cg=0/0 ttr=0
        at java.dyn.CoroutineSupport.unsafeSymmetricYieldTo(CoroutineSupport.java:140)
        - parking to wait for  <0x00000004fec97e60> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at com.alibaba.wisp.engine.WispTask.switchTo(WispTask.java:335)
        at com.alibaba.wisp.engine.WispCarrier.yieldTo(WispCarrier.java:427)
        at com.alibaba.wisp.engine.WispCarrier.schedule(WispCarrier.java:265)
        at com.alibaba.wisp.engine.WispTask.parkInternal(WispTask.java:426)
        at com.alibaba.wisp.engine.WispTask.jdkPark(WispTask.java:479)
        at com.alibaba.wisp.engine.WispEngine$5.park(WispEngine.java:273)
        at sun.misc.Unsafe.park(Unsafe.java:1029)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:176)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:842)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:876)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2092)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
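A note on reading the trace: the coroutine is not parked in the timed wait itself. It is in acquireQueued, called from awaitNanos, which means the timed wait has already ended and the task is trying to take the ReentrantLock back. That park has no timeout, so it depends entirely on an unpark from whoever releases the lock. The sketch below uses plain JDK classes only (no Wisp involved) to show that awaitNanos cannot return before the lock is reacquired, even after its timeout has expired. The class name, method name, and timings are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class AwaitReacquireSketch {

    /** Returns how long awaitNanos(timeoutMillis) took while another thread held the lock for holdMillis. */
    static long timedWaitMillis(long timeoutMillis, long holdMillis) {
        ReentrantLock lock = new ReentrantLock();
        Condition condition = lock.newCondition();
        CountDownLatch aboutToWait = new CountDownLatch(1);
        long[] elapsedMillis = new long[1];

        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                aboutToWait.countDown();
                long start = System.nanoTime();
                // Releases the lock, waits up to the timeout, then must reacquire the lock.
                condition.awaitNanos(TimeUnit.MILLISECONDS.toNanos(timeoutMillis));
                elapsedMillis[0] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        try {
            aboutToWait.await();
            lock.lock(); // succeeds once awaitNanos has released the lock
            try {
                Thread.sleep(holdMillis); // hold the lock well past the waiter's timeout
            } finally {
                lock.unlock(); // this release is what unparks the waiter
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return elapsedMillis[0];
    }

    public static void main(String[] args) {
        System.out.println("awaitNanos(50 ms) returned after " + timedWaitMillis(50, 300) + " ms");
    }
}
```

With a 50 ms timeout and a 300 ms hold, the reported time is on the order of the hold, not the timeout. If the unpark that accompanies the lock release were lost, the waiter would stay parked in acquireQueued, which matches the frame shown in the dump.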

Expected behavior
The submit thread should keep running.

JDK version
8.7.7

Execution environment

  • OS and version:
  • CPU model:
  • Number of CPU cores:
  • Size of physical memory:
  • Inside Linux container?
    • Linux container name (docker, pouch, etc):
    • Linux container version:
@joeyleeeeeee97 joeyleeeeeee97 self-assigned this Jul 6, 2021
@joeyleeeeeee97 joeyleeeeeee97 added the bug Something isn't working label Jul 6, 2021
joeyleeeeeee97 pushed a commit to joeyleeeeeee97/dragonwell8_jdk that referenced this issue Jul 6, 2021
Summary: In wisp1, coroutine timed waits are woken directly by the
scheduler instead of through park/unpark, so for historical reasons
there is a manual fix-up that lazily sets the status back to free.
In wisp2, however, when the timer is woken up by unpark, a PERMITTED
status may be overwritten and cause a problem.

Test Plan: com/alibaba/rcm/

Reviewed-by: leiyu, zhengxiaolinX, sanhong.lsh

Issue: dragonwell-project/dragonwell8#236
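
The overwrite described in the commit summary can be sketched as a small status state machine. This is a hypothetical model, not Wisp's actual code: the class name, the FREE/PARKED/PERMITTED values, and the compareAndSet variant are assumptions made for illustration, and the guarded reset is not necessarily the fix that was committed. It replays one interleaving on a single thread: the task is in a timed wait, an unpark leaves a permit, and then the wake-up path resets the status.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostPermitSketch {
    // Hypothetical status values; the real WispTask fields and constants may differ.
    static final int FREE = 0;
    static final int PARKED = 1;
    static final int PERMITTED = 2;

    /** Replays the interleaving and reports whether the unpark permit is still there afterwards. */
    static boolean permitSurvives(boolean unconditionalReset) {
        AtomicInteger status = new AtomicInteger(FREE);

        // 1. The task begins a timed wait.
        status.set(PARKED);

        // 2. An unpark (for example from a lock release) leaves a permit behind.
        status.set(PERMITTED);

        // 3. The timed-wait wake-up path resets the status.
        if (unconditionalReset) {
            status.lazySet(FREE);               // blind write: erases the permit
        } else {
            status.compareAndSet(PARKED, FREE); // guarded write: leaves PERMITTED alone
        }

        // 4. The next park returns immediately only if a permit is available.
        return status.compareAndSet(PERMITTED, FREE);
    }

    public static void main(String[] args) {
        System.out.println("unconditional reset keeps permit: " + permitSurvives(true));  // false
        System.out.println("guarded reset keeps permit: " + permitSurvives(false));       // true
    }
}
```

In the unconditional case the next park finds no permit and blocks, and because the unpark has already been delivered, nothing wakes the task again.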
joeyleeeeeee97 pushed a commit to joeyleeeeeee97/dragonwell8_jdk that referenced this issue Jul 8, 2021
joeyleeeeeee97 pushed a commit to joeyleeeeeee97/dragonwell8_jdk that referenced this issue Jul 9, 2021
joeyleeeeeee97 pushed a commit to dragonwell-project/dragonwell8_jdk that referenced this issue Jul 9, 2021