Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: add USR1 signal handler to dump goroutine #7587

Merged
merged 10 commits into from
Sep 3, 2018
Merged

server: add USR1 signal handler to dump goroutine #7587

merged 10 commits into from
Sep 3, 2018

Conversation

lysu
Copy link
Contributor

@lysu lysu commented Sep 3, 2018

What problem does this PR solve?

relate to #7558, we add USR1 signal handler to dump goroutine stack into stdout.

What is changed and how it works?

  • doesn't exit signal handler goroutine until cleanup down
  • handle USR1 signal and dump groutine into stdout

Check List

Tests

  • Manual test (add detailed scripts or steps below)
  1. pkill -USR1 tidb-server to send signal and check stdout see
2018/09/03 11:20:06.150  [info] === Got signal [user defined signal 1] to thread dump===
*** goroutine dump...
goroutine 107 [running]:
main.setupSignalHandler.func1(0xc4200f8e40, 0x14cdd80, 0xc4200dfec0)
	/home/robi/Code/go/src/github.com/pingcap/tidb/tidb-server/main.go:453 +0x2d8
created by main.setupSignalHandler
	/home/robi/Code/go/src/github.com/pingcap/tidb/tidb-server/main.go:446 +0x12e

goroutine 1 [sleep, 10 minutes]:
time.Sleep(0x1176592e000)
	/home/robi/go1.10/src/runtime/time.go:102 +0x166
main.cleanup()
	/home/robi/Code/go/src/github.com/pingcap/tidb/tidb-server/main.go:526 +0x3b
main.main()
	/home/robi/Code/go/src/github.com/pingcap/tidb/tidb-server/main.go:147 +0xbc

goroutine 18 [syscall]:
os/signal.signal_recv(0x14c2840)
	/home/robi/go1.10/src/runtime/sigqueue.go:139 +0xa6
os/signal.loop()
	/home/robi/go1.10/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
	/home/robi/go1.10/src/os/signal/signal_unix.go:28 +0x41

goroutine 114 [chan receive]:
github.com/pingcap/tidb/util/systimemon.StartMonitor(0x1410138, 0x140f6a8, 0xc420038f20)
	/home/robi/Code/go/src/github.com/pingcap/tidb/util/systimemon/systime_mon.go:30 +0x154
created by main.setupMetrics
	/home/robi/Code/go/src/github.com/pingcap/tidb/tidb-server/main.go:493 +0xb5
......
  1. temporal add Sleep(time.Mintue*20) in cleanup(main.go).
  2. pkill -QUIT tidb-server to shutdown tidb-server
  3. pkill -USR1 tidb-server again and got dump output too

Code changes

  • signal handler change

Side effects

  • no

Related changes

  • no

This change is Reviewable

@lysu lysu changed the title server: add USR1 signal handler to goroutine dump server: add USR1 signal handler to dump goroutine Sep 3, 2018
@lysu
Copy link
Contributor Author

lysu commented Sep 3, 2018

/run-all-tests

Copy link
Contributor

@crazycs520 crazycs520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@crazycs520 crazycs520 added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 3, 2018
select {
case sig := <-sc:
if sig == syscall.SIGUSR1 {
buf := make([]byte, 1<<20)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may cause a deadlock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make([]byte, 1<<20) ??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about it, see golang/go#23360, for more details.

buf := make([]byte, 1<<16)
for {
sig := <-dumpSignalChan
if sig == syscall.SIGUSR1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to check the signal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe future we need other signal?

Copy link
Contributor Author

@lysu lysu Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this if and sig local variable is in purpose

if sig == syscall.SIGUSR1 {
stackLen := runtime.Stack(buf, true)
log.Printf("=== Got signal [%s] to goroutine dump===\n*** goroutine dump...\n%s\n*** end\n", sig, buf[:stackLen])
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continue can be removed

@@ -48,7 +48,7 @@ import (
"github.com/pingcap/tidb/util/printer"
"github.com/pingcap/tidb/util/systimemon"
"github.com/pingcap/tidb/x-server"
binlog "github.com/pingcap/tipb/go-binlog"
"github.com/pingcap/tipb/go-binlog"
Copy link
Member

@jackysp jackysp Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This useless alias is used for some contributors to read the code easily. You could check its commit log.
Btw, I want to remove it either.

Copy link
Contributor Author

@lysu lysu Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.....I didn't modify it...it seem IDE or make fmt change it 🤣 I will rollback it.

@coocood
Copy link
Member

coocood commented Sep 3, 2018

LGTM

@lysu lysu added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 3, 2018
@lysu
Copy link
Contributor Author

lysu commented Sep 3, 2018

@coocood manual test seems be pass too

sig := <-dumpSignalChan
if sig == syscall.SIGUSR1 {
stackLen := runtime.Stack(buf, true)
log.Printf("=== Got signal [%s] to goroutine dump===\n*** goroutine dump...\n%s\n*** end\n", sig, buf[:stackLen])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about:

=== Got signal [%s] to dump goroutine stack. ===
%s
=== Finished dumping goroutine stack. ===

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done~~

@coocood coocood merged commit f270b10 into pingcap:master Sep 3, 2018
@lysu lysu deleted the dev-add-thread-dump-signal branch September 27, 2018 04:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/tools status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants