
Add benchmark project and action to alert performance issues #209

Closed · wants to merge 34 commits

Conversation

@OptimumCode (Collaborator) commented Jun 29, 2024

Hi, this is the initial version of the PR. @aSemy could you please take a look?

I faced some issues when adding the benchmark:

  1. For JS target I got an error during benchmark generation:
    KLIB resolver: Could not find "net.thauvin.erik.urlencoder:urlencoder-lib" in [.../snakeyaml-engine-kmp]
    
    I have not figured out the exact reason, but I added this library explicitly to the jsMain dependencies and it started working. Has anyone seen something like this before?
  2. The GitHub action works only with a single result file. Because of that, it is a bit tricky to upload results for other platforms (especially because benchmarks for windows and macos must be executed on corresponding runners).
    I think we could create a reusable workflow that executes all benchmarks available for a particular runner (ubuntu, windows, macos), collects the results (using artifacts or the GitHub cache), and uploads all of them in a separate job using a matrix build (which contains all the supported targets for that runner). But I don't know how this action will behave if those workflows try to commit changes in parallel - I think nothing good will happen... Maybe anyone has other ideas (create a PR to the action repo with multi-file support?).
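The reusable-workflow idea from item 2 could be sketched roughly like this (a hypothetical sketch only - the job names, Gradle task, and artifact paths are assumptions, not the actual setup):

```yaml
# Hypothetical sketch: run benchmarks per runner, collect results as
# artifacts, then publish everything from a single job.
jobs:
  run-benchmarks:
    strategy:
      matrix:
        os: [ ubuntu-latest, windows-latest, macos-latest ]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew benchmark
      # store each runner's results so a later job can combine them
      - uses: actions/upload-artifact@v4
        with:
          name: bench-results-${{ matrix.os }}
          path: build/reports/benchmarks/

  publish-results:
    needs: run-benchmarks
    runs-on: ubuntu-latest
    steps:
      # download every runner's results into one directory
      - uses: actions/download-artifact@v4
        with:
          path: all-results
      # a single publishing step here avoids parallel commits to gh-pages
```

Publishing from one job would sidestep the parallel-commit concern, because only one job ever pushes to the results branch.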

Here is what the generated page with benchmark results looks like:
https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/

To make it work, you need to do some preparation first (to allow the action to publish the results to GitHub Pages).

Branch with benchmark results:
https://github.com/OptimumCode/snakeyaml-engine-kmp/tree/gh-pages/dev/bench

Here is an example of a PR that introduces a performance issue:
OptimumCode#1

Also, every workflow run generates a summary where you can see how the benchmark results have changed:
https://github.com/OptimumCode/snakeyaml-engine-kmp/actions/runs/9725885272?pr=1

Resolves #208

@OptimumCode (Collaborator, Author) commented Jun 29, 2024

And I think this PR should be merged after #207 (I will rebase this branch once it is merged)

@OptimumCode (Collaborator, Author)

There is a suggestion on how multiple benchmark results can be combined and processed together.

@aSemy self-requested a review on June 30, 2024, 06:46
@OptimumCode (Collaborator, Author)

I have managed to join the results from multiple runs into a single file and upload them using benchmark-action. Here is the workflow step that does it (and the result). Could the same approach be used here? What do you think?
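For context, kotlinx-benchmark emits each target's results as a JSON array, so the joining step can be a simple array concatenation - for example with jq (a sketch; the directory layout and file names are assumptions):

```yaml
- name: Merge benchmark results
  run: |
    # each results file is a JSON array of benchmark records;
    # `jq -s 'add'` slurps all of them and concatenates the arrays
    jq -s 'add' all-results/*/*.json > merged-results.json
```

The merged file can then be handed to the benchmark action as if it came from a single run.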

@aSemy (Collaborator) commented Jul 1, 2024

Thanks for this! I will take a look.

Comment on lines 37 to 39
// TODO: there is a Load class in JVM sources that can handle all of it
// but it is not available for common code.
// Probably, it should be moved from JVM sources to common sources.
Collaborator:

Good idea, I'll try to commonize Load.


permissions:
# deployments permission to deploy GitHub pages website
deployments: write
Collaborator:

Just a thought: maybe writing to GitHub Pages should only be enabled when running on the main branch? If I understand the benchmark action correctly, the benchmarks should only be updated when running on main, but I wonder if an additional guard would help.
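Such a guard could be a plain `if:` condition on the publishing step (a sketch; the step configuration shown here is hypothetical, not the workflow under review):

```yaml
- name: Store benchmark result
  uses: benchmark-action/github-action-benchmark@v1
  # extra guard: only push results when running on the main branch
  if: github.ref == 'refs/heads/main'
  with:
    tool: 'jmh'
    output-file-path: build/reports/benchmarks/result.json
    auto-push: true
```

On pull requests the condition is false, so the step is skipped and nothing is written to gh-pages regardless of the workflow's permissions.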

Collaborator (Author):

I cannot find a way to modify the permissions dynamically (the github context is not available in that scope). I could split this workflow into two with separate sets of permissions, plus a reusable workflow to reduce duplication. That should resolve the concerns about writing to gh-pages from PRs.

Collaborator (Author):

But it would still be possible to modify the PR workflow and push results from a PR (it would just be more obvious, because the permissions would have to be changed).

settings.gradle.kts (outdated; resolved)
Comment on lines 61 to 64
implementation("net.thauvin.erik.urlencoder:urlencoder-lib:1.5.0") {
    because(
        "error during benchmark generation: e:\n" +
            "KLIB resolver: Could not find \"net.thauvin.erik.urlencoder:urlencoder-lib\" in [.../snakeyaml-engine-kmp]"
    )
}
Collaborator:

That's an unexpected error! I wonder why it doesn't happen on other targets. I'll take a look...

Maybe snake-kmp needs to mark it as api()?

Collaborator (Author):

I think this might be related to how the benchmark plugin resolves dependencies when compiling the generated benchmarks. But I believe snake-kmp should not mark this dependency as api, because it does not expose this library in its public API (even if that fixes the problem for some reason).

Collaborator (Author):

You are actually right: if you declare the dependency on that library as api, the JS benchmark compiles without any issues. But it is not right to mark a library as api when it is not used in the public API. I will raise a question in the kotlinx-benchmark repo - maybe they can help with that.

Collaborator (Author):

There is already an issue for that: Kotlin/kotlinx-benchmark#185
Re-declaration of dependencies.

settings.gradle.kts (outdated; resolved)
@OptimumCode (Collaborator, Author)

Hi, @aSemy. I want to rebase the branch onto the latest main. Please let me know if you are making any changes - in that case I will merge instead of rebasing (though I would prefer a linear history).

@aSemy (Collaborator) commented Jul 1, 2024

I'm not working on this branch, go ahead and rebase if you'd like. We'll squash-merge the branch into main, so I wouldn't worry too much about rebasing vs. merging.

@OptimumCode (Collaborator, Author)

Hi, @aSemy. I merged the results for all platforms, and they are all available on one page (https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/). Does it look okay? Could you please take a look when you have time?

@OptimumCode (Collaborator, Author)

Hi, @krzema12. Could you please also take a look at the PR?

The example can be found in the forked repo: OptimumCode#1
https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/

@krzema12 (Owner)

@OptimumCode thanks for this, it looks good in general, but we need to sort out one problem. From what I see, the new workflow publishes the report via GitHub Pages, but we already use Pages to publish the API reference: https://github.com/krzema12/snakeyaml-engine-kmp/blob/main/.github/workflows/publish-site.yml.

One idea is to deploy both the docs and the performance report using GH Pages, but under different paths. Then we'd also need to ensure that both workflows don't override each other's results.

I'm gonna sleep on this problem, maybe a better solution will appear, but I'm also open to your ideas!

@OptimumCode (Collaborator, Author)

Hi, @krzema12. Thank you for pointing out that you already use GitHub Pages in another workflow. I looked into the documentation for the actions/upload-pages-artifact, actions/deploy-pages, and benchmark-action/github-action-benchmark actions - it seems they won't work together without some manual steps.
actions/deploy-pages expects a single artifact containing all the content for GitHub Pages, and it fully overrides the previously deployed data. This means the results from benchmark-action/github-action-benchmark would override the previously deployed documentation.

I have one thought here: benchmark-action/github-action-benchmark action supports publishing results into another repository. What do you think about creating a snakeyaml-engine-kmp-benchmarks (or any other name you prefer) repository and publishing results into this repo?
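If such a repository were created, the publishing step could be pointed at it via the action's `gh-repository` input (a sketch; the repository owner placeholder, secret name, and file path are assumptions for illustration):

```yaml
- uses: benchmark-action/github-action-benchmark@v1
  with:
    tool: 'jmh'
    output-file-path: merged-results.json
    # publish to a dedicated repository instead of this repo's gh-pages
    gh-repository: 'github.com/OWNER/snakeyaml-engine-kmp-benchmarks'
    # a personal access token with write access to the benchmarks repo;
    # BENCHMARKS_REPO_TOKEN is a hypothetical secret name
    github-token: ${{ secrets.BENCHMARKS_REPO_TOKEN }}
    auto-push: true
```

Note that the default GITHUB_TOKEN cannot push to another repository, which is why a personal access token would be needed here.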

@krzema12 (Owner)

I'm fine with the separate repo! I'll let you know once created. Do I need to set up any extra perms anywhere?

@OptimumCode (Collaborator, Author)

You will need to create a personal access token with deployments and contents write permissions for the new repo and add it to the secrets here.

@krzema12 (Owner)

@OptimumCode done! The secret is called PUBLISH_BENCHMARK_RESULTS and will expire in a year (it's impossible to set no expiration date). The dedicated repo is here: https://github.com/krzema12/snakeyaml-engine-kmp-benchmarks.

@OptimumCode (Collaborator, Author)

Thank you, @krzema12. I will make the necessary changes and let you know once it is done.

@OptimumCode (Collaborator, Author)

Hi, @krzema12. Could you please advise which branch you have configured for GitHub Pages in the new repo? If you haven't configured one yet, could you please create a branch (gh-pages is the default expected by the action) and configure GitHub Pages to serve from it?
I completely forgot about this when you asked what was needed.

@krzema12 (Owner)

@OptimumCode I've just configured the gh-pages branch.

@OptimumCode (Collaborator, Author)

Hi, @krzema12. Could you please check that the name of the secret is correct? Either the name is wrong or the secret is not visible in the PR workflow...

One more thing: could you please add pull_request write permission to this token for the main repo (not for the repo with benchmark results)? This is needed to generate an alert comment.

@krzema12 (Owner)

Hi @OptimumCode!

Could you please check if the name of the secret is correct? Either the name is not correct or the secret is not visible in pr workflow...

Yeah, the name is correct, checked by copy-pasting between GitHub's UI and workflow's definition:

(screenshot: the secret's name in GitHub's UI)

What I can do is regenerate the token and update the secret, to make sure I correctly pasted it. I've just done it - let's try now!

One more thing, could you please add pull_request write permission to this token for main repo (not for repo with benchmark results)? This is needed to generate an alert comment

I think you mean ticking the "Allow GitHub Actions to create and approve pull requests" checkbox, is it correct? If yes, I've just done it!

@OptimumCode (Collaborator, Author)

I think you mean ticking the "Allow GitHub Actions to create and approve pull requests" checkbox, is it correct? If yes, I've just done it!

I am not sure that will do the trick. I was talking about the token permissions:

(screenshot: fine-grained PAT permission settings)

@krzema12 (Owner)

@OptimumCode what you show on the screenshot looks like configuring a personal access token's perms - I see this when configuring the PAT for publishing the report to the other repo.

For the GitHub token, I'm only aware of the setting I mentioned, and setting fine-grained perms via YAML, see https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/

I'll try digging deeper into why things don't work as expected here.

@OptimumCode (Collaborator, Author) commented Jul 26, 2024

Yes, it is the PAT configuration. We use it instead of the GitHub token in the benchmark action, and in order to generate alert comments the token needs that permission (I will remove the permissions configuration from the workflow itself, since we use the PAT).

I'll try digging deeper into why things don't work as expected here.

Please let me know if I can help here in any way (right now I have limited access to the internet, but in a few hours I will be fully available).

@krzema12 (Owner)

@OptimumCode ok, got it now! I somehow thought that the GH token is also used, but now I see that we switched to PAT to be able to write to another repo. I've adjusted the PAT's perms, let's see now.

@OptimumCode (Collaborator, Author)

Superseded by #215


Successfully merging this pull request may close these issues.

Add benchmarks to catch performance degradation on earlier stages
3 participants