Allow Dev to Run Comparison Benchmarks On Two Branches #213

stanbrub · 2023-11-10T00:24:05Z

The "ask" is to run a set of benchmarks against two Deephaven core branches/versions and provide comparative results. Below are two approaches, each with a rough idea of the work involved.

Related Tickets:

Option 1 (higher up-front cost but easier to support)
It would take significant time, I think. Some of the pieces are there, but there are always complications. Off the top of my head we would need to:

Build docker images from two branches (main could be the latest edge image) and put them somewhere they can be retrieved
Auto-provision a server for it to run on (and teardown/check-in when done)
Provide a workflow from Benchmark to select the branches
Get GitHub project permissions to kick off the image builds from Benchmark workflows
Provide a way to select Benchmark subsets rather than the whole thing (which for a comparison would take 8-9hrs)
Provide a process for storing the results in GCloud to a special directory structure for branch builds
Other things that pop up when we dig into it in earnest
I'm not too worried about the GitHub workflows, but hardware provisioning or anything requiring permissions is always a stop/start/stop/start headache. If the Demo server stuff I did is any guide, it could take a month or more of inching our way there.
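To make two of the pieces above more concrete (selecting a Benchmark subset, and storing branch results under their own directory structure), here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the glob-pattern selection, the benchmark names, and the `adhoc/<run id>/<branch>` layout are hypothetical and are not taken from the Benchmark project.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select_benchmarks(all_names, patterns):
    """Return the names matching any glob pattern, keeping the original order.

    Hypothetical helper: the real project may select subsets differently.
    """
    return [n for n in all_names if any(fnmatchcase(n, p) for p in patterns)]


def branch_results_path(root, run_id, branch):
    """Build a results location for one branch of a comparison run.

    The 'adhoc' segment and the overall layout are assumptions.
    Slashes in branch names are replaced so each branch maps to one directory.
    """
    safe_branch = branch.replace("/", "_")
    return PurePosixPath(root) / "adhoc" / run_id / safe_branch


if __name__ == "__main__":
    names = ["AvgByTest", "SumByTest", "WhereTest"]  # hypothetical names
    print(select_benchmarks(names, ["*ByTest"]))
    print(branch_results_path("results", "run-001", "user/my-fix"))
```

Keeping both branches of a run under one run directory would make it simple to find the pair of result sets to compare later.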

Option 2 (lower up-front cost but harder to support)
An alternative approach would be:

Check out the Benchmark project and build the uber-jar
Run against the latest edge docker image
Build a docker image from the local branch and run against that image
(Both could technically be run from a non-docker DHC but automatic restart would not occur)
Compare results from the results directory
A big con to this approach is that it commandeers your laptop and more of your time.
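The final "compare results" step could be as small as the following Python sketch. It assumes each run leaves a CSV in its results directory with one row per benchmark, a name column, and a rate column. The column names `benchmark_name` and `op_rate` are assumptions here and should be adjusted to whatever the results file actually contains.

```python
import csv
import io


def load_rates(csv_text, name_col="benchmark_name", rate_col="op_rate"):
    """Parse CSV text into {benchmark name: rate}. Column names are assumed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[name_col]: float(row[rate_col]) for row in reader}


def compare_rates(base, candidate):
    """Percent change in rate for benchmarks present in both runs.

    Positive means the candidate branch is faster. Benchmarks missing from
    either run, or with a zero base rate, are skipped.
    """
    changes = {}
    for name in sorted(base.keys() & candidate.keys()):
        if base[name] == 0:
            continue
        changes[name] = (candidate[name] - base[name]) * 100.0 / base[name]
    return changes


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the output.
    edge = "benchmark_name,op_rate\nA,100\nB,200\n"
    local = "benchmark_name,op_rate\nA,110\nB,150\nC,5\n"
    for name, pct in compare_rates(load_rates(edge), load_rates(local)).items():
        print(f"{name}: {pct:+.1f}%")
```

In practice the CSV text would be read from each run's results directory (for example with `pathlib.Path(...).read_text()`), one for the edge image run and one for the local-branch run.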

Which Option Is Best?
Prefer Option 1 because:

No docker image comparison vs local hack would work
Results are available for others to view and come from a more uniform approach
A provisioned server is used, with no interference from a dev trying to do other work on the same system
Supporting a single hardware profile is better than supporting multiple JVM versions, OSes, Mem heaps, and other laptop "configs"
It's more fun than supporting developers hacking on Benchmark to get this to work locally (though it's not hard)
Eventually the same Dashboards that are used for Nightly and Release comparison could be used for branch comparison

stanbrub added the enhancement New feature or request label Nov 10, 2023
