-
-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor performance improvements #14
Conversation
This establishes baseline performance of the algorithm using some resonable default settings. On my local machine, this yeilds the following bechmark results: goos: linux goarch: amd64 pkg: github.com/esimov/triangle cpu: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz BenchmarkDraw-4 21 3074890062 ns/op PASS ok github.com/esimov/triangle 67.892s Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
This commit switches several types from taking 4-bytes-per-pixel to a single byte per pixel. The behavior (so far as I can tell) is unchanged, but the speedup is noticable: goos: linux goarch: amd64 pkg: github.com/esimov/triangle cpu: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz BenchmarkDraw-4 25 2865398063 ns/op PASS ok github.com/esimov/triangle 74.604s Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
This brings another few percent of performance improvement: goos: linux goarch: amd64 pkg: github.com/esimov/triangle cpu: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz BenchmarkDraw-4 27 2498826294 ns/op PASS ok github.com/esimov/triangle 70.112s Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
This tweak appears to have a minor performance improvemnt associated, as no dynamic allocation has to occur for the slice embedded within the type. goos: linux goarch: amd64 pkg: github.com/esimov/triangle cpu: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz BenchmarkDraw-4 28 2474394774 ns/op PASS ok github.com/esimov/triangle 71.881s Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @whereswaldon thanks a lot for your contribution, I really appreciate. I left a comment on the benchmark test, otherwise everything is just fine and the speed improvement is quite considerable. Please do that slight adjustment and I will merge it back your PR into the master branch.
This preserves support for older go versions. Signed-off-by: Chris Waldon <christopher.waldon.dev@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good for me. I have merged it back to the main branch. Thanks a lot.
Hello! Thanks so much for this amazing library. It's been really fun to play with over the last day or so.
This PR is a series of tweaks I made to boost performance. These were guided by
pprof
and the included benchmark, but only tested on a single x86_64 Linux machine. They may not be an improvement everywhere.That being said, they appear to represent a 24% speedup on my test hardware without compromising the output in any way.
I have never worked with these algorithms before, and it is definitely possible that my work contains mistakes that compromise the algorithms under some edge case that I failed to consider or did not understand.
I also don't know enough about the algorithms to know if the parameters being benchmarked represent the normal use-case well enough. Perhaps the benchmark should be using different values?
I did all of this because I have a toy program that transforms my webcam output using this library, and I'm trying to boost the realtime processing capability of this library. Even with all of these enhancements applied, I can only get 15-20FPS for very low-res images.
Regardless, I hope these are a useful reference point for optimization, even if they ultimately aren't merged.