Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken PFM-enabled tests #1623

Merged
merged 11 commits into from
Jul 7, 2023
Merged

Fix broken PFM-enabled tests #1623

merged 11 commits into from
Jul 7, 2023

Conversation

macandy13
Copy link
Contributor

@macandy13 macandy13 commented Jul 5, 2023

Performance counter tests are broken (not sure if they ever worked or got broken over the last couple of prs). Sadly the way to fix most of them was to reduce test coverage quite a bit, but IMO the tests in this state are still ok to have around. Ideally a test using more than 6 HW counters would be great to exercise the grouping mechanism, but I couldn't find enough HW counters that can be used from a non-privileged test.

This was uncovered when trying to enable PFM test as part of github workflows. That sadly is infeasible because the CI machines do not support performance counters at all, so the tests will always fail.

test/perf_counters_gtest.cc Outdated Show resolved Hide resolved
Andy Christiansen added 3 commits July 7, 2023 10:15
…d, and there's a comment explaining the rationale behind the new test code.
@macandy13 macandy13 changed the title Enable CI from pfm=1 setting & fix broken tests Fix broken PFM-enabled tests Jul 7, 2023
@macandy13 macandy13 marked this pull request as ready for review July 7, 2023 08:52
@dmah42 dmah42 merged commit 4931aef into google:main Jul 7, 2023
56 of 60 checks passed
@HFTrader
Copy link
Contributor

HFTrader commented Aug 31, 2023

I did some work on performance counters and yes, the performance counters tests were broken since then, more than a year ago, because they were built under unreasonable assumptions on how performance counters work.

On Github actions, and in any virtual machine in general, hw and hw/cache performance counters are generally not available. This is a feature of virtual machines, not a problem with the code itself.

I can't localize the original comments but my understanding after questioning is that those performance counter tests were there so anyone with a real machine and respective permissions could run the tests and verify that they were correct.

Note that even with all permissions is still possible for the tests to fail given the wild variety of hardware and cache profiles even under the Intel line.

However removing the code as was done takes away a tool that future developers would have to verify their work.

@macandy13
Copy link
Contributor Author

Sorry for the insanely late reply, @HFTrader. The reason I took these tests away was that it was very unclear under which circumstances they should work, so in a sense they are useless to some amount of people, depending on HW.

I guess we could bring them back in a different form - my first naive idea would be some binary that can be used to see which performance counters exist on the machine. But you might have better ideas.

Happy to collaborate on this, and apologies again for replying ultra-late.

@HFTrader
Copy link
Contributor

I have so many things on my plate right now, I wouldn't be able to collaborate, realistically. Even then, if those tests are not benefitting anybody, what's the point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants