
Add Benchmark workflows #5842

Merged: 2 commits merged into open-telemetry:main on Sep 27, 2023

Conversation

tylerbenson (Member)

These workflows run a few JMH benchmarks and publish the results to a `benchmarks` branch. The expectation is that the branch would be published via GitHub Pages.

benchmark-tags.yml is intended to run manually as needed to populate benchmark data from historical versions. This works because the benchmarks being run have been in place for a long time. If we needed a new benchmark, methodologies would need to change.

In order for this to work, a new GitHub Pages branch needs to be created (assuming the branch is named `benchmarks`):

```sh
git checkout --orphan benchmarks
git commit --allow-empty -m "Init"
git push origin benchmarks
```

(not tested, but something to this effect.)

Note: this won't actually work until the org-level self-hosted runner is configured to allow these workflows.

@tylerbenson tylerbenson requested a review from a team September 20, 2023 19:53
codecov bot commented Sep 20, 2023

Codecov Report

All modified lines are covered by tests ✅

see 3 files with indirect coverage changes


trask (Member) commented Sep 21, 2023

nice! can you run it on your fork (using normal github runner) and link to the gh-pages branch so we can see what the report looks like?

tylerbenson (Member, Author) commented Sep 21, 2023

Here's what it looks like for now:
https://securingcarter.github.io/opentelemetry-java/
(This is using the same kind of self-hosted runner. I'm using a template similar to the collector's, with slightly different grouping to better match JMH.)

```yaml
with:
  tool: 'jmh'
  output-file-path: sdk/trace/build/jmh-result.json
  gh-pages-branch: benchmarks
```
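For context, the excerpt above looks like the inputs to a `benchmark-action/github-action-benchmark` step. A fuller sketch of such a step might look as follows; the step name, action version, and the `github-token`/`auto-push` wiring are assumptions for illustration, not taken from this PR:

```yaml
# Sketch of a JMH publishing step (assumed: benchmark-action/github-action-benchmark).
# Only tool, output-file-path, and gh-pages-branch come from the excerpt above.
- name: Store benchmark result
  uses: benchmark-action/github-action-benchmark@v1
  with:
    tool: 'jmh'
    output-file-path: sdk/trace/build/jmh-result.json
    gh-pages-branch: benchmarks
    github-token: ${{ secrets.GITHUB_TOKEN }}
    # Push the updated data to the gh-pages-branch automatically.
    auto-push: true
```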

tylerbenson (Member, Author):

Will do!

tylerbenson (Member, Author):

Done.

jack-berg (Member) left a comment:

Can you open a tracking issue and describe the related follow-up work you anticipate? This looks like a good start of a solution, but IMO it has some follow-on work:

  • Need to decide which benchmarks and why
  • Need to have some sort of way to abstract the benchmarks away such that we can reliably run them for old (within reason) and new versions
  • Need to set up the github.io publishing, link to it from the appropriate places, and describe the methodology (why the benchmarks are selected and how users should think about them)

```yaml
- name: Run Benchmark
  run: |
    cd sdk/trace/build
    java -jar libs/opentelemetry-sdk-trace-*-jmh.jar -rf json SpanBenchmark SpanPipelineBenchmark ExporterBenchmark
```
A reviewer (Member):

What's the advantage of this over `./gradlew :sdk:trace:jmh -PjmhIncludeSingleClass=SpanBenchmark`?

tylerbenson (Member, Author):

I didn't try that approach. This seems equivalent to me.

```yaml
on:
  push:
    branches: [ main ]
  workflow_dispatch:
```
A reviewer (Member):

Instead of having two workflow files (one for backfilling and one for updates to main), can we have a single workflow which takes an optional tag as an argument, defaulting to the main branch if not set?
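The suggestion above could be sketched as one workflow with an optional `workflow_dispatch` tag input; the checkout falls back to the default branch when no tag is supplied. The names and structure here are illustrative, not from this PR:

```yaml
# Illustrative sketch: one workflow covering both pushes to main and
# manual backfill runs against a historical tag.
on:
  push:
    branches: [ main ]
  workflow_dispatch:
    inputs:
      tag:
        description: 'Release tag to benchmark (defaults to main)'
        required: false
        default: ''

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # An empty ref falls back to the branch that triggered the run (main).
          ref: ${{ inputs.tag }}
```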

tylerbenson (Member, Author):

I can, but I worry that would give the impression that it works on any tag, when it can only work with 1.6+. Is this important to you? It seems like that would involve a lot more work for backfilling all the versions I enumerated here. We don't even need to keep this long term; it can be deleted after a successful run.

A reviewer (Member):

Not important, and we can iterate. If it turns out that we want to backfill performance runs on a recurring basis, let's consider refactoring into some sort of reusable workflow (i.e. `reusable-markdown-link-check.yml`).

tylerbenson (Member, Author):

@jack-berg This is part of a broader cross-language effort. This isn't intended as the final state. Eventually we hope to have these results published on the otel docs site too, similar to what the collector group is doing. See the following references:

> Need to decide which benchmarks and why

This test selection was used because it already existed and didn't require any changes. The baseline tests should be the ones defined in the specification repo.

> Need to have some sort of way to abstract the benchmarks away such that we can reliably run them for old (within reason) and new versions

I think that's a reasonable idea and can be considered independently from this PR, especially when new benchmarks are created that don't have the history the current ones do. My primary intention for this effort was to establish the benchmarking pipeline with tests running in a stable environment and reports being published, not to be opinionated on which specific tests are being run.

> Need to setup the github.io publishing, link to it from the appropriate places, and describe the methodology (why the benchmarks are selected and how users should think about them)

I can't iterate on this until the benchmarks branch is created and the workflows are in place.

tylerbenson (Member, Author):

Another important point to add:
The self-hosted runner is shared across the otel org and doesn't auto-scale, so we need to be considerate and not monopolize its time. This is one reason we don't want to run on each PR. Currently I chose to run the benchmark on each merge to main because the execution time is only a few minutes. If the runtime increases, either from adding new benchmarks or from increasing iteration counts, we should reconsider this and run only at release time.

jack-berg (Member):

Can you open a tracking issue for anything you plan to iterate on in this repo? It can either be a single issue with a bulleted list of items, or individual issues. I want to get a feel for what this would look like.

tylerbenson (Member, Author):

@jack-berg done: #5857

jack-berg merged commit aff4e12 into open-telemetry:main on Sep 27, 2023 (17 checks passed).
tylerbenson deleted the tyler/benchmark branch on September 27, 2023.
tylerbenson (Member, Author):

Runner permissions granted: open-telemetry/community#1699
