Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: excessive CPU usage when using Ticker and Sleep #27707

Open
lni opened this issue Sep 17, 2018 · 127 comments
Open

time: excessive CPU usage when using Ticker and Sleep #27707

lni opened this issue Sep 17, 2018 · 127 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@lni
Copy link

lni commented Sep 17, 2018

What version of Go are you using (go version)?

go1.11 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lni/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/lni/golang_ws"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build440058908=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I need to call a function roughly every millisecond, there is no real time requirement, as long as it is called roughly every millisecond everything is fine. However, I realised that both time.Ticker and time.Sleep() are causing excessive CPU overhead, probably due to runtime scheduling.

The following Go program uses 20-25% %CPU as reported by top on Linux.

package main

import (
  "time"
)

func main() {
  ticker := time.NewTicker(time.Millisecond)
  defer ticker.Stop()
  for {
    select {
    case <-ticker.C:
    }
  }
}

for loop with range is causing similar overhead

package main

import (
  "time"
)

func main() {
  ticker := time.NewTicker(time.Millisecond)
  defer ticker.Stop()
  for range ticker.C {
  }
}

while the following program is showing 10-15% %CPU in top

package main

import (
  "time"
)

func main() {
  for {
    time.Sleep(time.Millisecond)
  }
}

To workaround the issue, I had to move the ticker/sleep part to C and let the C code to call the Go function that need to be invoked every millisecond. Such a cgo based ugly hack reduced %CPU in top to 2-3%. Please see the proof of concept code below.

ticker.h

#ifndef TEST_TICKER_H
#define TEST_TICKER_H

void cticker();

#endif // TEST_TICKER_H

ticker.c

#include <unistd.h>
#include "ticker.h"
#include "_cgo_export.h"

void cticker()
{
  for(int i = 0; i < 30000; i++)
  {
    usleep(1000);
    Gotask();
  }
}

ticker.go

package main

/*
#include "ticker.h"
*/
import "C"
import (
  "log"
)

var (
  counter uint64 = 0
)

//export Gotask
func Gotask() {
  counter++
}

func main() {
  C.cticker()
  log.Printf("Gotask called %d times", counter)
}

What did you expect to see?

Much lower CPU overhead when using time.Ticker or time.Sleep()

What did you see instead?

20-25% %CPU in top

@agnivade
Copy link
Contributor

I am unable to reproduce this. Both programs show 5-6% CPU for me. And I have 4 CPUs. When you say 20-25%, how many CPUs do you have ?

@lni
Copy link
Author

lni commented Sep 17, 2018

As described, all reported % figures are per process %CPU value reported in top.

I tried two servers, one with dual E5-2690v2 (10 cores X 2 with HT) and another server with a single E5-2696v4 (22 cores X 1 with HT). Other people have confirmed high CPU usage on their Mac OS machines as well, see below.

https://groups.google.com/forum/#!topic/golang-nuts/_XEuq69kUOY

@djadala
Copy link
Contributor

djadala commented Sep 17, 2018

Hi,
Same here,
Linux 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux,
4x AMD Phenom(tm) II X4 965 Processor,

top:

Tasks: 362 total,   1 running, 361 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  0.4 sy,  0.0 ni, 98.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  8142260 total,   607488 free,  2363780 used,  5170992 buff/cache
KiB Swap:  4194300 total,  4144460 free,    49840 used.  5385516 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                   
 1689 djadala   20   0    2476    708    520 S  22.8  0.0   0:01.90 ticker1                                                   

go version:
go version go1.10.3 linux/amd64

ticker1.go:

package main

import (
	"time"
)

func main() {
	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for range ticker.C {
	}
}

@agnivade agnivade changed the title Excessive CPU usage when using time.Ticker and time.Sleep time: excessive CPU usage when using Ticker and Sleep Sep 17, 2018
@agnivade agnivade added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 17, 2018
@agnivade agnivade added this to the Go1.12 milestone Sep 17, 2018
@bcmills
Copy link
Contributor

bcmills commented Sep 17, 2018

Dup of #25471?

@mbyio
Copy link

mbyio commented Sep 17, 2018

Percentage of CPU use is not very specific. Can you describe how it is actually affecting you? Is it preventing the program from working correctly somehow? Go is not necessarily specifically optimized for minimal CPU usage.

@bcmills
Copy link
Contributor

bcmills commented Sep 17, 2018

Can you describe how it is actually affecting you? Is it preventing the program from working correctly somehow?

On a laptop or other mobile device, percentage of CPU is more-or-less equivalent to battery lifetime. We should not waste users' battery capacity needlessly.

(Using a lot more CPU than expected in a real program is a problem in and of itself: it doesn't need extra justification.)

@choleraehyq
Copy link
Contributor

@bcmills I'd like to implement time.Ticker with timerfd to solve this issue, but timerfd_* APIs are merged into kernel 2.6.24, as our minimal kernel requirement is 2.6.23. When can we drop support of kernel 2.6.23?

@bcmills
Copy link
Contributor

bcmills commented Nov 19, 2018

When can we drop support of kernel 2.6.23?

For that, you should probably file a proposal.

@ianlancetaylor
Copy link
Member

@choleraehyq Can you sketch out how using the timerfd functions will help? It's not immediately obvious to me. Thanks.

@gmox
Copy link

gmox commented Dec 1, 2018

I can confirm this behavior. It looks as if NewTicker is leaving futex locks open, and that drives up CPU/memory usage. Using After does not result in increased CPU/memory usage.

@djadala
Copy link
Contributor

djadala commented Dec 1, 2018

Hi,

Using After does not result in increased CPU/memory usage

Please include code that you use to measure.
Here is my code, sleep & timer use 10%cpu, ticker use 20%cpu :

package main

import (
	"flag"
	"fmt"
	"time"
)

func ticker() {

	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for range ticker.C {
	}
}

func timer() {
	t := time.NewTimer(time.Millisecond)
	for range t.C {
		t.Reset(time.Millisecond)
	}
}

func sleep() {
	for {
		time.Sleep(time.Millisecond)
	}
}

func main() {
	t := flag.String("t", "timer", "use timer")
	flag.Parse()
	switch *t {
	case "timer":
		timer()
	case "ticker":
		ticker()
	case "sleep":
		sleep()
	default:
		fmt.Println("use  timer, ticker or sleep")
	}
}

I think that 10% is also high cpu usage.

@lni
Copy link
Author

lni commented Dec 1, 2018

I tried djadala's program above on an Amazon EC2 a1.medium node with 1 ARM Cortex-A72 core, OS is Ubuntu 18.04LTS with a 4.15.0-1028-aws kernel, Go arm64 version 1.11.2. Interestingly, the CPU usage is only around 3% for ticker/timer/sleep.

@djadala
Copy link
Contributor

djadala commented Dec 20, 2018

for reference i included syscall and cgo variants, cgo use 2% cpu, and syscall use 12%

package main

import (
	"flag"
	"fmt"
	"time"

	"syscall"
)

// #include <unistd.h>
// //#include <errno.h>
// //int usleep(useconds_t usec);
import "C"

func usleep() {
	for {
		C.usleep(1000)
	}
}

func sysns() {
	for {
		v := syscall.Timespec{
			Sec:  0,
			Nsec: 1000,
		}
		syscall.Nanosleep(&v, &v)
	}
}

func ticker() {

	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for range ticker.C {
	}
}

func timer() {
	t := time.NewTimer(time.Millisecond)
	for range t.C {
		t.Reset(time.Millisecond)
	}
}

func sleep() {
	for {
		time.Sleep(time.Millisecond)
	}
}

func main() {
	t := flag.String("t", "timer", "use timer")
	flag.Parse()
	switch *t {
	case "timer":
		timer()
	case "ticker":
		ticker()
	case "sleep":
		sleep()
	case "cgo":
		usleep()
	case "sys":
		sysns()
	default:
		fmt.Println("use  timer, ticker, sys, cgo or sleep")
	}
}

@robaho
Copy link

robaho commented Dec 20, 2018

I did some testing on this, comparing Go, C, and Java. The Go uses time.Sleep(), C uses usleep(), and Java uses LockSupport.ParkNanos(). Interestingly, Java and Go use the same 'timeout on semaphore' to implement. The following shows CPU utilization (on OSX).

System 1 us 100 us 1 ms
Go 130% 23.0% 4.6%
Java 67% 6.0 % 1.5%
C 42% 0.6 % 0.0%

I believe the performance issue in Go is due to multiple threads being involved (the timerproc, and the user routine) causing a lot of overhead.

The simplest solution, although it doesn't work for channel select timeout, would be to have short duration time.Sleep() just invoke usleep(). There are probably very few threads using timers of this duration, so the overhead of allocating and blocking the thread shouldn't be that great. I would expect the resulting runtime to be superior to Java and on par with C.

In fact, it is doubtful a sleep request for less than 10 us should even sleep at all - just call runtime.Gosched() since the overhead of a context switch is probably longer than the sleep request.

@robaho
Copy link

robaho commented Dec 28, 2018

I did some more testing by writing a CGo function USleep that calls C's usleep directly.

System 1 us 100 us 1 ms
CGo 50% 5.3% 1.3%

So this appears to be a viable alternative for polling with a short interval and low polling routine count.

@andybons andybons modified the milestones: Go1.12, Go1.13 Feb 12, 2019
@ls0f

This comment has been minimized.

@ianlancetaylor

This comment has been minimized.

@networkimprov
Copy link

@ianlancetaylor maybe this could be assigned to someone?

@ianlancetaylor
Copy link
Member

I can't tell anybody to do anything. I can assign it to someone but that doesn't mean anything will happen.

Rereading this I would like to double-check that there is still something to do here. @dvyukov sped up the timer code quite a bit in the 1.12 release while working on #25729. To what extent is this still a problem?

@dvyukov
Copy link
Member

dvyukov commented Mar 8, 2019

We still handle timers on a separate thread, so there are like 4 excessive thread context switches per timer. Timers need to be part of scheduler/netpoll, i.e. call epoll_wait with the timers timeout.

@ianlancetaylor
Copy link
Member

I wonder if it would make sense to say that if we call time.Sleep for a time <= 10 milliseconds, and the number of active goroutines is <= GOMAXPROCS, then we should just sleep rather than going through the timer thread. That might handle microbenchmarks like this, including client programs, without compromising servers. (I chose 10 milliseconds based on sysmon).

@robaho
Copy link

robaho commented Mar 8, 2019 via email

@ChrisHines
Copy link
Contributor

This issue sounds similar to what I described in #28808. In that issue a lot of the unexpected CPU load was due to the work stealing code in the scheduler after goroutines go to sleep.

It would be nice if we could reduce that CPU load when there are short lived timers pending.

I wonder if it would make sense to say that if we call time.Sleep for a time <= 10 milliseconds, and the number of active goroutines is <= GOMAXPROCS, then we should just sleep rather than going through the timer thread.

What do you mean by "active goroutines <= GOMAXPROCS"? Does active mean "existing" or does it mean "in a runnable state"?

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/204800 mentions this issue: runtime: wake netpoller when dropping P, don't sleep too long in sysmon

gopherbot pushed a commit that referenced this issue Nov 1, 2019
If multiple goroutines call time.(*Timer).Reset then the timer will go
from timerWaiting to timerDeleted to timerModifying to timerModifiedLater.
The timer can be on a different P, meaning that simultaneously cleantimers
could change it from timerDeleted to timerRemoving to timerRemoved.
If Reset sees timerRemoved, it was doing an atomic.Store of timerWaiting,
meaning that it did not necessarily see the other values set in the timer,
so the timer could appear to be in an inconsistent state. Use atomic.Cas
to avoid that possibility.

Updates #6239
Updates #27707
Fixes #35272

Change-Id: I1d59a13dc4f2ff4af110fc6e032c8c9d59cfc270
Reviewed-on: https://go-review.googlesource.com/c/go/+/204717
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Nov 4, 2019
When dropping a P, if it has any timers, and if some thread is
sleeping in the netpoller, wake the netpoller to run the P's timers.
This mitigates races between the netpoller deciding how long to sleep
and a new timer being added.

In sysmon, if all P's are idle, check the timers to decide how long to sleep.
This avoids oversleeping if no thread is using the netpoller.
This can happen in particular if some threads use runtime.LockOSThread,
as those threads do not block in the netpoller.

Also, print the number of timers per P for GODEBUG=scheddetail=1.

Before this CL, TestLockedDeadlock2 would fail about 1% of the time.
With this CL, I ran it 150,000 times with no failures.

Updates #6239
Updates #27707
Fixes #35274
Fixes #35288

Change-Id: I7e5193e6c885e567f0b1ee023664aa3e2902fcd1
Reviewed-on: https://go-review.googlesource.com/c/go/+/204800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/206938 mentions this issue: runtime: acquire timersLocks around moveTimers

gopherbot pushed a commit that referenced this issue Nov 13, 2019
In the discussion of CL 171828 we decided that it was not necessary to
acquire timersLock around the call to moveTimers, because the world is
stopped. However, that is not correct, as sysmon runs even when the world
is stopped, and it calls timeSleepUntil which looks through the timers.
timeSleepUntil acquires timersLock, but that doesn't help if moveTimers
is running at the same time.

Updates #6239
Updates #27707
Updates #35462

Change-Id: I346c5bde594c4aff9955ae430b37c2b6fc71567f
Reviewed-on: https://go-review.googlesource.com/c/go/+/206938
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/207348 mentions this issue: runtime: release timersLock while running timer

gopherbot pushed a commit that referenced this issue Nov 19, 2019
Dan Scales pointed out a theoretical deadlock in the runtime.

The timer code runs timer functions while holding the timers lock for a P.
The scavenger queues up a timer function that calls wakeScavenger,
which acquires the scavenger lock.

The scavengeSleep function acquires the scavenger lock,
then calls resetTimer which can call addInitializedTimer
which acquires the timers lock for the current P.

So there is a potential deadlock, in that the scavenger lock and
the timers lock for some P may both be acquired in different order.
It's not clear to me whether this deadlock can ever actually occur.

Issue 35532 describes another possible deadlock.

The pollSetDeadline function acquires pd.lock for some poll descriptor,
and in some cases calls resettimer which can in some cases acquire
the timers lock for the current P.

The timer code runs timer functions while holding the timers lock for a P.
The timer function for poll descriptors winds up in netpolldeadlineimpl
which acquires pd.lock.

So again there is a potential deadlock, in that the pd lock for some
poll descriptor and the timers lock for some P may both be acquired in
different order. I think this can happen if we change the deadline
for a network connection exactly as the former deadline expires.

Looking at the code, I don't see any reason why we have to hold
the timers lock while running a timer function.
This CL implements that change.

Updates #6239
Updates #27707
Fixes #35532

Change-Id: I17792f5a0120e01ea07cf1b2de8434d5c10704dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/207348
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Nov 19, 2019
Updates #6239
Updates #27707

Change-Id: I65e6471829c9de4677d3ac78ef6cd7aa0a1fc4cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/171884
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
@ianlancetaylor
Copy link
Member

The new timer code is committed for 1.14. This code reduces lock contention and context switches for timers. This seems to have a clearly beneficial effect for real programs.

With regard to this issue, these changes reduce the CPU requirements for the simple ticker program. They don't seem to affect the simple sleep program very much. Leaving this issue open for further investigation.

@robaho
Copy link

robaho commented Nov 19, 2019 via email

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/209580 mentions this issue: runtime: use current P's race context in timer code

gopherbot pushed a commit that referenced this issue Dec 2, 2019
We were using the race context of the P that held the timer,
but since we unlock the P's timers while executing a timer
that could lead to a race on the race context itself.

Updates #6239
Updates #27707
Fixes #35906

Change-Id: I5f9d5f52d8e28dffb88c3327301071b16ed1a913
Reviewed-on: https://go-review.googlesource.com/c/go/+/209580
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/232298 mentions this issue: runtime: reduce timer latency

@ChrisHines
Copy link
Contributor

ChrisHines commented Sep 21, 2020

While responding to review comments by @ianlancetaylor on https://golang.org/cl/232298 I discovered that it appears to improve the issue reported here. Specifically I compiled the following program with Go 1.13, 1.14 and that CL.

package main

import (
  "time"
)

func main() {
  for {
    time.Sleep(time.Millisecond)
  }
}

I ran all three builds on the same machine at the same time. The machine has 56 hardware threads and was mostly idle. The OS is CentOS 7.4.1708.

With that setup top reported the following typical %CPU (of one hardware thread) for each program:

  • Go 1.13: 9.6%
  • Go 1.14: 11.6%
  • CL 232298: 6.9%

So there is at least some evidence that the CL provides an improvement over both old and new timers.

@bukowa
Copy link

bukowa commented Sep 24, 2020

I don't know if this is related but i myself experienced high CPU usage while not using sleep. This is one of my loops. Without sleep in default case and 50 goroutines running below loop all 24 threads are at 100%. With sleep it's less than 0.2%.

for {
			select {
			default:
				// unnoticeable sleep to prevent high cpu usage
				time.Sleep(time.Microsecond)

			case <-exit:
				// crawler has completed
				return

			case response := <-crawl.Crawler.ResponseChan:
}

@andig
Copy link
Contributor

andig commented Sep 24, 2020

Without the sleep your for/select loop will spin the default case, so this is expected. The code above would probably work fine with the default case removed entirely.

fishy added a commit to fishy/thrift that referenced this issue Oct 7, 2020
Client: go

This is a follow up to 4db7a0a.

Due to the go runtime bug [1], the 1ms ticker used by this feature could
cause excessive cpu usage. So until the bug is fixed in go it's better
to leave this behavior disabled by default. The instructions in the
README file has been updated to how to enable this feature if needed.

[1] golang/go#27707
fishy added a commit to fishy/thrift that referenced this issue Oct 7, 2020
Client: go

This is a follow up to 4db7a0a.

Due to the go runtime bug [1], the 1ms ticker used by this feature could
cause excessive cpu usage. So until the bug is fixed in go it's better
to leave this behavior disabled by default. The instructions in the
README file has been updated to how to enable this feature if needed.

[1] golang/go#27707
fishy added a commit to fishy/thrift that referenced this issue Oct 7, 2020
Client: go

Because of the Go runtime bug [1], the previous default value of 1ms is
not a great default as it could cause excessive cpu usage. Use 5ms
instead as a balance between being useful and not causing too much cpu
overhead.

It's still configurable.

[1]: golang/go#27707
fishy added a commit to fishy/thrift that referenced this issue Oct 7, 2020
Client: go

This is a follow up to 4db7a0a.

Because of the Go runtime bug [1], the previous default value of 1ms is
not a great default as it could cause excessive cpu usage. Use 5ms
instead as a balance between being useful and not causing too much cpu
overhead.

It's still configurable.

[1]: golang/go#27707
fishy added a commit to apache/thrift that referenced this issue Oct 8, 2020
Client: go

This is a follow up to 4db7a0a.

Because of the Go runtime bug [1], the previous default value of 1ms is
not a great default as it could cause excessive cpu usage. Use 5ms
instead as a balance between being useful and not causing too much cpu
overhead.

It's still configurable.

[1]: golang/go#27707
gopherbot pushed a commit that referenced this issue Oct 27, 2020
Change the scheduler to treat expired timers with the same approach it
uses to steal runnable G's.

Previously the scheduler ignored timers on P's not marked for
preemption. That had the downside that any G's waiting on those expired
timers starved until the G running on their P completed or was
preempted. That could take as long as 20ms if sysmon was in a 10ms
wake up cycle.

In addition, a spinning P that ignored an expired timer and found no
other work would stop despite there being available work, missing the
opportunity for greater parallelism.

With this change the scheduler no longer ignores timers on
non-preemptable P's or relies on sysmon as a backstop to start threads
when timers expire. Instead it wakes an idle P, if needed, when
creating a new timer because it cannot predict if the current P will
have a scheduling opportunity before the new timer expires. The P it
wakes will determine how long to sleep and block on the netpoller for
the required time, potentially stealing the new timer when it wakes.

This change also eliminates a race between a spinning P transitioning
to idle concurrently with timer creation using the same pattern used
for submission of new goroutines in the same window.

Benchmark analysis:

CL 232199, which was included in Go 1.15 improved timer latency over Go
1.14 by allowing P's to steal timers from P's not marked for preemption.
The benchmarks added in this CL measure that improvement in the
ParallelTimerLatency benchmark seen below. However, Go 1.15 still relies
on sysmon to notice expired timers in some situations and sysmon can
sleep for up to 10ms before waking to check timers. This CL fixes that
shortcoming with modest regression on other benchmarks.

name \ avg-late-ns                                        go14.time.bench  go15.time.bench  fix.time.bench
ParallelTimerLatency-8                                         17.3M ± 3%        7.9M ± 0%       0.2M ± 3%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=1-8        53.4k ±23%       50.7k ±31%     252.4k ± 9%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=2-8         204k ±14%         90k ±58%       188k ±12%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=3-8        1.17M ± 0%       0.11M ± 5%      0.11M ± 2%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=4-8        1.81M ±44%       0.10M ± 4%      0.10M ± 2%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=5-8        2.28M ±66%       0.09M ±13%      0.08M ±21%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=6-8        2.84M ±85%       0.07M ±15%      0.07M ±18%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=7-8        2.13M ±27%       0.06M ± 4%      0.06M ± 9%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=8-8        2.63M ± 6%       0.06M ±11%      0.06M ± 9%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=9-8        3.32M ±17%       0.06M ±16%      0.07M ±14%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=10-8       8.46M ±20%       4.37M ±21%      5.03M ±23%
StaggeredTickerLatency/work-dur=2ms/tickers-per-P=1-8          1.02M ± 1%       0.20M ± 2%      0.20M ± 2%

name \ max-late-ns                                        go14.time.bench  go15.time.bench  fix.time.bench
ParallelTimerLatency-8                                         18.3M ± 1%        8.2M ± 0%       0.5M ±12%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=1-8         141k ±19%        127k ±19%      1129k ± 3%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=2-8        2.78M ± 4%       1.23M ±15%      1.26M ± 5%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=3-8        6.05M ± 5%       0.67M ±56%      0.81M ±33%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=4-8        7.93M ±20%       0.71M ±46%      0.76M ±41%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=5-8        9.41M ±30%       0.92M ±23%      0.81M ±44%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=6-8        10.8M ±42%        0.8M ±41%       0.8M ±30%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=7-8        9.62M ±24%       0.77M ±38%      0.88M ±27%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=8-8        10.6M ±10%        0.8M ±32%       0.7M ±27%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=9-8        11.9M ±36%        0.6M ±46%       0.8M ±38%
StaggeredTickerLatency/work-dur=300µs/tickers-per-P=10-8       36.8M ±21%       24.7M ±21%      27.5M ±16%
StaggeredTickerLatency/work-dur=2ms/tickers-per-P=1-8          2.12M ± 2%       1.02M ±11%      1.03M ± 7%

Other time benchmarks:
name \ time/op          go14.time.bench  go15.time.bench  fix.time.bench
AfterFunc-8                  137µs ± 4%       123µs ± 4%      131µs ± 2%
After-8                      212µs ± 3%       195µs ± 4%      204µs ± 7%
Stop-8                       165µs ± 6%       156µs ± 2%      151µs ±12%
SimultaneousAfterFunc-8      260µs ± 3%       248µs ± 3%      284µs ± 2%
StartStop-8                 65.8µs ± 9%      64.4µs ± 7%     67.3µs ±15%
Reset-8                     13.6µs ± 2%       9.6µs ± 2%      9.1µs ± 4%
Sleep-8                      307µs ± 4%       306µs ± 3%      320µs ± 2%
Ticker-8                    53.0µs ± 5%      54.5µs ± 5%     57.0µs ±11%
TickerReset-8                                9.24µs ± 2%     9.51µs ± 3%
TickerResetNaive-8                            149µs ± 5%      145µs ± 5%

Fixes #38860
Updates #25471
Updates #27707

Change-Id: If52680509b0f3b66dbd1d0c13fa574bd2d0bbd57
Reviewed-on: https://go-review.googlesource.com/c/go/+/232298
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Ian Lance Taylor <iant@golang.org>
@LeGamerDc
Copy link

LeGamerDc commented Dec 7, 2022

maybe heap structure is too heavy incase many goroutine is using timer/ticker/sleep and each with small sleep time(like < 10ms). i see much time used in siftdownTimer maintain the heap structure.
what if use timer wheel structure for small timer, or new timer lib might be same good.

also, i noticed that much time used in runtime.casgstatus doing cas, comment said its give way to goroutine gc. however, even i disable gc, the cas loop still use much time.

simple code are:

import _ "net/http/pprof"

func main() {
        for i := 0; i < 10000; i++ {
                go func() {
                        for {
                                time.Sleep(time.Millisecond * 5)
                        }
                }()
        }
        debug.SetGCPercent(-1)
        runtime.GC()
        fmt.Println(http.ListenAndServe("localhost:8081", nil))
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests