Improve performance of endpoints deduplication #299

Merged
pleshakov merged 2 commits into main from chore/imrpove-endpoints-dedup-performance
Dec 5, 2022

Conversation

pleshakov
Contributor

Proposed changes

We use a map as a set to deduplicate endpoints. Before deduplicating, we can count the total number of endpoints in the input and, assuming that most of them are unique, pass that count as the size hint when initializing the map. This improves performance by reducing the cost of repeatedly growing the map to accommodate all the endpoints.
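A minimal sketch of the idea, not the code from this PR: the `endpoint` type and the `dedupEndpoints` function are hypothetical names used only for illustration, and the input is assumed to arrive as several groups of endpoints. The only technique shown is passing the total count as the size hint to `make`.

```go
package main

import "fmt"

// endpoint is a hypothetical stand-in for the project's endpoint type.
// It must be comparable so it can be used as a map key.
type endpoint struct {
	Address string
	Port    int
}

// dedupEndpoints returns the unique endpoints across all groups,
// preserving first-seen order.
func dedupEndpoints(groups [][]endpoint) []endpoint {
	// Count all endpoints up front. If most are unique, this is close
	// to the final size of the set.
	total := 0
	for _, g := range groups {
		total += len(g)
	}

	// The size hint lets the runtime allocate enough space once
	// instead of growing the map repeatedly while it fills up.
	seen := make(map[endpoint]struct{}, total)
	unique := make([]endpoint, 0, total)

	for _, g := range groups {
		for _, ep := range g {
			if _, ok := seen[ep]; ok {
				continue // duplicate, already recorded
			}
			seen[ep] = struct{}{}
			unique = append(unique, ep)
		}
	}
	return unique
}

func main() {
	groups := [][]endpoint{
		{{Address: "10.0.0.1", Port: 80}, {Address: "10.0.0.2", Port: 80}},
		{{Address: "10.0.0.1", Port: 80}},
	}
	fmt.Println(len(dedupEndpoints(groups))) // prints 2
}
```

The hint is an upper bound here: if the input contains many duplicates, the map is over-allocated, which is the trade-off accepted by assuming most endpoints are unique.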

The benchmarks are included.

The output of running the benchmarks on my machine:

BenchmarkResolve/1_endpoints-4                                   2177905               560.7 ns/op           632 B/op          4 allocs/op
BenchmarkResolve/1_endpoints_with_optimization-4                 2098209               578.2 ns/op           632 B/op          4 allocs/op
BenchmarkResolve/2_endpoints-4                                   1868185               630.5 ns/op           656 B/op          4 allocs/op
BenchmarkResolve/2_endpoints_with_optimization-4                 1840216               650.4 ns/op           656 B/op          4 allocs/op
BenchmarkResolve/5_endpoints-4                                   1457457               866.1 ns/op           736 B/op          4 allocs/op
BenchmarkResolve/5_endpoints_with_optimization-4                 1363773               861.5 ns/op           736 B/op          4 allocs/op
BenchmarkResolve/10_endpoints-4                                   659551              1714 ns/op            1268 B/op          5 allocs/op
BenchmarkResolve/10_endpoints_with_optimization-4                 731506              1489 ns/op            1060 B/op          4 allocs/op
BenchmarkResolve/25_endpoints-4                                   278122              3889 ns/op            2739 B/op          6 allocs/op
BenchmarkResolve/25_endpoints_with_optimization-4                 355548              2991 ns/op            2060 B/op          4 allocs/op
BenchmarkResolve/50_endpoints-4                                   150068              8093 ns/op            5475 B/op          9 allocs/op
BenchmarkResolve/50_endpoints_with_optimization-4                 192177              5640 ns/op            3748 B/op          5 allocs/op
BenchmarkResolve/100_endpoints-4                                   74073             15507 ns/op           11112 B/op         11 allocs/op
BenchmarkResolve/100_endpoints_with_optimization-4                109096             10806 ns/op            7269 B/op          5 allocs/op
BenchmarkResolve/500_endpoints-4                                   10000            101384 ns/op           75413 B/op         24 allocs/op
BenchmarkResolve/500_endpoints_with_optimization-4                 21944             56614 ns/op           42824 B/op          5 allocs/op
BenchmarkResolve/1000_endpoints-4                                   5320            201992 ns/op          150373 B/op         43 allocs/op
BenchmarkResolve/1000_endpoints_with_optimization-4                11083            106729 ns/op           85448 B/op          5 allocs/op

@pleshakov pleshakov requested a review from a team as a code owner November 8, 2022 18:32
@github-actions github-actions bot added the chore Pull requests for routine tasks label Nov 8, 2022
@pleshakov pleshakov force-pushed the chore/imrpove-endpoints-dedup-performance branch from a372ca1 to ebaae58 Compare November 8, 2022 18:33
@pleshakov pleshakov merged commit 3507fc2 into main Dec 5, 2022
@pleshakov pleshakov deleted the chore/imrpove-endpoints-dedup-performance branch December 5, 2022 23:33
3 participants