Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU 100% in asyncProducer.newBrokerProducer.func2 #2251

Closed
MoilkNepho opened this issue Jun 10, 2022 · 4 comments
Closed

CPU 100% in asyncProducer.newBrokerProducer.func2 #2251

MoilkNepho opened this issue Jun 10, 2022 · 4 comments
Labels
bug needs-investigation Issues that require followup from maintainers stale Issues and pull requests without any recent activity

Comments

@MoilkNepho
Copy link

MoilkNepho commented Jun 10, 2022

Version:

Sarama Kafka Go
1.33.0 0.10.2.1 1.16.3

Problem

CPU 100% when closing broker, here is CPU profiling:

(pprof) top100                      
Showing nodes accounting for 150.58s, 99.69% of 151.05s total
Dropped 3 nodes (cum <= 0.76s)
      flat  flat%   sum%        cum   cum%
    64.28s 42.56% 42.56%    144.92s 95.94%  runtime.selectgo
    23.17s 15.34% 57.89%     23.17s 15.34%  runtime.lock2
    20.25s 13.41% 71.30%     20.25s 13.41%  runtime.unlock2
    14.32s  9.48% 80.78%     37.83s 25.04%  runtime.sellock
    13.52s  8.95% 89.73%     33.78s 22.36%  runtime.selunlock
     4.37s  2.89% 92.62%    151.05s   100%  github.com/Shopify/sarama.(*asyncProducer).newBrokerProducer.func2
     3.87s  2.56% 95.19%      3.87s  2.56%  runtime.fastrand (inline)
     1.91s  1.26% 96.45%      5.78s  3.83%  runtime.fastrandn
     1.69s  1.12% 97.57%      1.69s  1.12%  github.com/eapache/queue.(*Queue).Peek
     1.69s  1.12% 98.69%      2.85s  1.89%  runtime.typedmemclr
     1.16s  0.77% 99.46%      1.16s  0.77%  runtime.memclrNoHeapPointers
     0.24s  0.16% 99.62%     23.41s 15.50%  runtime.lockWithRank (inline)
     0.10s 0.066% 99.68%     23.51s 15.56%  runtime.lock
     0.01s 0.0066% 99.69%     20.26s 13.41%  runtime.unlock (inline)
         0     0% 99.69%    151.05s   100%  github.com/Shopify/sarama.withRecover
         0     0% 99.69%     20.25s 13.41%  runtime.unlockWithRank (inline)
(pprof) 
(pprof) list newBrokerProducer.func2
Total: 151.05s
ROUTINE ======================== github.com/Shopify/sarama.(*asyncProducer).newBrokerProducer.func2 in /data/root/go/pkg/mod/github.com/!shopify/sarama@v1.33.0/async_producer.go
     4.37s    151.05s (flat, cum)   100% of Total
         .          .    738:   // In order to avoid a deadlock when closing the broker on network or malformed response error
         .          .    739:   // we use an intermediate channel to buffer and send pending responses in order
         .          .    740:   // This is because the AsyncProduce callback inside the bridge is invoked from the broker
         .          .    741:   // responseReceiver goroutine and closing the broker requires such goroutine to be finished
         .          .    742:   go withRecover(func() {
         .       60ms    743:           buf := queue.New()
     500ms      500ms    744:           for {
      80ms       90ms    745:                   if buf.Length() == 0 {
         .          .    746:                           res, ok := <-pending
         .          .    747:                           if !ok {
         .          .    748:                                   // We are done forwarding the last pending response
         .          .    749:                                   close(responses)
         .          .    750:                                   return
         .          .    751:                           }
         .          .    752:                           buf.Add(res)
         .          .    753:                   }
         .          .    754:                   // Send the head pending response or buffer another one
         .          .    755:                   // so that we never block the callback
     550ms      2.24s    756:                   headRes := buf.Peek().(*brokerProducerResponse)
     2.03s    146.95s    757:                   select {
     740ms      740ms    758:                   case res, ok := <-pending:
      70ms       70ms    759:                           if !ok {
         .          .    760:                                   continue
         .          .    761:                           }
         .          .    762:                           buf.Add(res)
         .          .    763:                           continue
     400ms      400ms    764:                   case responses <- headRes:
         .          .    765:                           buf.Remove()
         .          .    766:                           continue
         .          .    767:                   }
         .          .    768:           }
         .          .    769:   })

Reference

@MoilkNepho
Copy link
Author

@slaunay hi~ can you please check this issue?

@github-actions github-actions bot added the stale Issues and pull requests without any recent activity label Aug 17, 2023
@IBM IBM deleted a comment from github-actions bot Aug 17, 2023
@dnwe
Copy link
Collaborator

dnwe commented Aug 17, 2023

@MoilkNepho did you resolve this issue with newer versions or is it something you still see occurring?

@github-actions github-actions bot removed the stale Issues and pull requests without any recent activity label Aug 17, 2023
@dethi
Copy link

dethi commented Aug 18, 2023

I was doing some testing of Sarama v1.38.1 today, trying to debug a deadlock in the AsyncProducer code that happens during broker rolling restart (ever since the v1.31.x release, unsure if it's in our code or Sarama for now), and I noticed the exact same behaviour from time to time: 100% CPU usage in that function.

So I would say it's still occurring yes.

@dnwe dnwe added bug needs-investigation Issues that require followup from maintainers labels Aug 18, 2023
Copy link

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur.
Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.

@github-actions github-actions bot added the stale Issues and pull requests without any recent activity label Nov 16, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug needs-investigation Issues that require followup from maintainers stale Issues and pull requests without any recent activity
Projects
None yet
Development

No branches or pull requests

3 participants