Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make several optimizations to worker pool #680

Merged
merged 7 commits into from
Oct 27, 2019
Merged

Make several optimizations to worker pool #680

merged 7 commits into from
Oct 27, 2019

Conversation

panjf2000
Copy link
Contributor

@panjf2000 panjf2000 commented Oct 23, 2019

  1. Since ready that stores workers is sequential, the process of cleaning up expired workers can be speeded up by means of the binary-search algorithm;
  2. Speed it up when iterating the slice of workerChan, avoiding copy from range;
  3. Use sync.Pool as a more canonical way.

Signed-off-by: Andy Pan panjf2000@gmail.com

@panjf2000 panjf2000 changed the title Use binary-search algorithm to speed up cleaning up workers Utilize binary-search algorithm to speed up cleaning up workers Oct 24, 2019
@panjf2000
Copy link
Contributor Author

@erikdubbelboer @kirillDanshin
Please take a look.

@erikdubbelboer
Copy link
Collaborator

Thank you for the pull request. Interesting idea but can you show some benchmark results that this code is actually worth it? I have this feeling this will barely be any faster as there usually won't be many entries in ready and even if there are it will usually only clean a couple from the front.
I'm just wondering at which point this is faster. Maybe benchmark with different lengths of ready slices and with different offsets in the slice at which criticalTime.After(lastUseTime) is true.

@panjf2000
Copy link
Contributor Author

panjf2000 commented Oct 24, 2019

@erikdubbelboer

I have this feeling this will barely be any faster as there usually won't be many entries in ready and even if there are it will usually only clean a couple from the front.

I don't think so, in your case, http servers should be receiving high-traffic and continual http requests, which keeps workers active. Yet I believe there are more general scenarios that http servers get idle times frequently and it leads to many workers in ready where binary-search will make a higher performance in cleaning expired workers, in my opinion.

Besides, binary-search ensures its time cost will be reduced from N to log2(N) on the average, I think we ought to give consideration to various scenarios as fasthttp is a framework, right?

@panjf2000 panjf2000 changed the title Utilize binary-search algorithm to speed up cleaning up workers Optimizations in worker pool Oct 24, 2019
@panjf2000 panjf2000 changed the title Optimizations in worker pool Optimizations to worker pool Oct 24, 2019
@panjf2000 panjf2000 changed the title Optimizations to worker pool Make several optimizations to worker pool Oct 24, 2019
@erikdubbelboer
Copy link
Collaborator

Can you show a small benchmark that for example shows that the binary search is faster than just the linear search for a slice of 10,100,1000,10000 entries for example. And then for each of those sizes have variations where criticalTime.After(lastUseTime) happens at 10%, 50% and 90% of the size of the slice. In my experience binary search isn't always faster because of CPU cache and branch prediction.

@panjf2000
Copy link
Contributor Author

panjf2000 commented Oct 26, 2019

Hi @erikdubbelboer
I've added benchmarks between binary-search and linear-search of cleaning up workers and It turns out that binary-search does beat linear-search under various cases:

BenchmarkCleanWorkers1/10,10%/binary_search-4         	50000000	        26.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10,10%/linear_search-4         	100000000	        24.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/100,10%/binary_search-4        	30000000	        46.3 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/100,10%/linear_search-4        	30000000	        66.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/1000,10%/binary_search-4       	30000000	        55.6 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/1000,10%/linear_search-4       	 3000000	       421 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10000,10%/binary_search-4      	20000000	        71.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10000,10%/linear_search-4      	  300000	      4052 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10,50%/binary_search-4         	50000000	        25.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10,50%/linear_search-4         	50000000	        31.3 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/100,50%/binary_search-4        	30000000	        40.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/100,50%/linear_search-4        	10000000	       235 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/1000,50%/binary_search-4       	30000000	        57.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/1000,50%/linear_search-4       	 1000000	      1999 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10000,50%/binary_search-4      	20000000	        70.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10000,50%/linear_search-4      	  100000	     20265 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10,90%/binary_search-4         	50000000	        25.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10,90%/linear_search-4         	30000000	        43.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/100,90%/binary_search-4        	30000000	        35.6 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/100,90%/linear_search-4        	 5000000	       381 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/1000,90%/binary_search-4       	30000000	        56.0 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/1000,90%/linear_search-4       	  300000	      3603 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10000,90%/binary_search-4      	20000000	        77.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10000,90%/linear_search-4      	   30000	     42476 ns/op	       0 B/op	       0 allocs/op

Benchmark code has been committed to this PR, please check it out, thanks~

@panjf2000
Copy link
Contributor Author

@erikdubbelboer

BenchmarkCleanWorkers1/10,10%/binary_search-4         	50000000	        20.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10,10%/linear_search-4         	100000000	        16.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/100,10%/binary_search-4        	30000000	        41.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/100,10%/linear_search-4        	30000000	        58.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/1000,10%/binary_search-4       	30000000	        56.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/1000,10%/linear_search-4       	 3000000	       453 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10000,10%/binary_search-4      	20000000	        71.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers1/10000,10%/linear_search-4      	  300000	      4328 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10,50%/binary_search-4         	50000000	        25.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10,50%/linear_search-4         	50000000	        39.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/100,50%/binary_search-4        	30000000	        41.3 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/100,50%/linear_search-4        	 5000000	       235 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/1000,50%/binary_search-4       	30000000	        56.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/1000,50%/linear_search-4       	  500000	      2311 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10000,50%/binary_search-4      	20000000	        73.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers2/10000,50%/linear_search-4      	   50000	     22120 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10,90%/binary_search-4         	50000000	        27.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10,90%/linear_search-4         	30000000	        46.0 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/100,90%/binary_search-4        	50000000	        35.6 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/100,90%/linear_search-4        	 3000000	       409 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/1000,90%/binary_search-4       	30000000	        58.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/1000,90%/linear_search-4       	  300000	      4339 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10000,90%/binary_search-4      	20000000	        75.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkCleanWorkers3/10000,90%/linear_search-4      	   50000	     42352 ns/op	       0 B/op	       0 allocs/op

@panjf2000
Copy link
Contributor Author

@erikdubbelboer
Any other comments?

Copy link
Collaborator

@erikdubbelboer erikdubbelboer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Those are exactly the benchmarks I wanted to see. It's much faster than I expected to be honest. We should definitely merge this change.

But I'm sorry, I meant I just want to see the benchmarks to make sure this would actually be a big improvement. Actually committing them in the code seems a bit overkill. Especially since you're not really benchmarking the function itself so any future changes won't affect the benchmarks and won't help us find issues.

Can you please remove the benchmarks, then I can merge the pull request. Thanks!

workerpool.go Outdated Show resolved Hide resolved
@panjf2000
Copy link
Contributor Author

@erikdubbelboer
Please take a look at my new commit.

@erikdubbelboer erikdubbelboer merged commit 9f11af2 into valyala:master Oct 27, 2019
@erikdubbelboer
Copy link
Collaborator

Thanks!

@panjf2000 panjf2000 deleted the worker-pool-opt branch October 27, 2019 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants