Remove concurrent doubly linked list from kotlinx.coroutines codebase #3886

Open
qwwdfsad opened this issue Sep 15, 2023 · 1 comment
@qwwdfsad
Collaborator

We have LockFreeLinkedListNode and co. based on the "Lock-Free and Practical Doubly Linked List-Based Deques Using Single-Word Compare-and-Swap" paper.

The implementation has a long-standing tail of issues:

  1. The paper itself is known to have linearizability issues that trigger various failures when the data structure is stressed enough.
  2. DCLL is too generic: it supports every operation of an ordinary doubly linked list (bidirectional iteration, mid-list removal, etc.), which makes reasoning about and maintaining the concurrent invariants difficult. Attempts to isolate a compact subset containing only the required invariants have all failed.
  3. The implementation is slower than it could be: every mutating operation takes at least 4 CASes, and every added element is a separate object with multiple atomic fields (prev, next, removal marker).
  4. Due to 2 and 3, a correct implementation is bloated: it accounts for ~10% of the optimized DEX size of kotlinx.coroutines.

The proposed solution is straightforward: remove DCLL and replace it with the recently added FADD-based ConcurrentLinkedList that the semaphore, mutex, and channels already use.
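
For readers unfamiliar with the fetch-and-add (FADD) idea, here is a minimal sketch of why it tends to be cheaper than a CAS-heavy doubly linked list. It is not the kotlinx.coroutines `ConcurrentLinkedList`; the names `Segment`, `AppendOnlyLog`, and `SEGMENT_SIZE` are hypothetical, and the structure is append-only to keep the example small. Each writer claims a unique slot with a single `getAndIncrement`, and a CAS is needed only when a new segment has to be linked.

```kotlin
import java.util.concurrent.atomic.AtomicLong
import java.util.concurrent.atomic.AtomicReference
import java.util.concurrent.atomic.AtomicReferenceArray

// Hypothetical segment size; a real implementation would likely use a larger value.
const val SEGMENT_SIZE = 4

// One segment holds several elements, so there is no per-element node object.
class Segment<T>(val id: Long) {
    val cells = AtomicReferenceArray<T>(SEGMENT_SIZE)
    val next = AtomicReference<Segment<T>?>(null)
}

// Illustrative append-only structure, NOT the library's ConcurrentLinkedList.
class AppendOnlyLog<T : Any> {
    private val head = Segment<T>(0)
    private val tail = AtomicLong(0)

    // A single fetch-and-add hands every caller a unique global index.
    fun add(value: T): Long {
        val index = tail.getAndIncrement()
        val segment = findSegment(index / SEGMENT_SIZE)
        segment.cells.set((index % SEGMENT_SIZE).toInt(), value)
        return index
    }

    // Returns null for an out-of-range index or a claimed slot that is not written yet.
    fun get(index: Long): T? {
        if (index < 0 || index >= tail.get()) return null
        return findSegment(index / SEGMENT_SIZE).cells.get((index % SEGMENT_SIZE).toInt())
    }

    // Walks forward from the head, linking missing segments with a CAS on `next`.
    private fun findSegment(id: Long): Segment<T> {
        var current = head
        while (current.id < id) {
            val next = current.next.get()
            if (next != null) {
                current = next
                continue
            }
            // If the CAS loses the race, another thread has already linked a segment.
            current.next.compareAndSet(null, Segment(current.id + 1))
            current = current.next.get()!!
        }
        return current
    }
}
```

Removal, segment reclamation, and moving the head forward are deliberately left out; those are the parts where a production implementation needs the most care.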

@SIMULATAN

I just spotted one core pinned at 100% CPU (out of 8 cores in total) after my application had been idling at 0% for hours.
[Image: CPU graph]

According to top -H, 100% of my application CPU usage is caused by the DefaultDispatcher thread. Looking at the thread dump, it only has this one stack:

kotlinx.coroutines.internal.LockFreeLinkedListNode.removeOrNext(LockFreeLinkedList.kt:208)	DefaultDispatcher-worker-6

After profiling, JMC shows the most sampled method to be, by far, LockFreeLinkedListNode.getNext().
[Image: JMC]

Could this be related to the issues described above, and will removing the implementation fix it?
I'm using Kotlin 2.0.10 + kotlinx.coroutines 1.8.1. I will upgrade shortly and see whether I can reproduce it with 1.9.0. Application source
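
In case it helps anyone else check for the same symptom, the thread-dump step above can also be done from inside the JVM. This is a minimal sketch using only `Thread.getAllStackTraces()` from the JDK; the helper name `stacksOfThreads` is hypothetical, and the `"DefaultDispatcher"` prefix is taken from the thread name in the dump above.

```kotlin
// Collects the current stack frames of every live thread whose name starts with `prefix`.
// Threads that share a name overwrite each other in the resulting map.
fun stacksOfThreads(prefix: String): Map<String, List<String>> =
    Thread.getAllStackTraces()
        .filterKeys { it.name.startsWith(prefix) }
        .map { (thread, frames) -> thread.name to frames.map { it.toString() } }
        .toMap()

fun main() {
    // Print the top frame of each matching worker; a thread that is parked has
    // a park/wait frame on top, while a spinning one shows the hot method.
    for ((name, frames) in stacksOfThreads("DefaultDispatcher")) {
        println("$name: ${frames.firstOrNull() ?: "<no frames>"}")
    }
}
```

Sampling this a few times in a row and seeing the same worker at the same top frame would match what `top -H` and the profiler showed here.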
