tf: aarch64 fix for cases where CNTFRQ_EL0 returns bogus value #2929

@pmolodo pmolodo commented Feb 2, 2024

Description of Change(s)

As noted in this commit in the Linux kernel:

torvalds/linux@c6f97ad

...the value of CNTFRQ_EL0 is sometimes unreliable. The Linux kernel instead reads the tick rate from the device tree, and falls back on CNTFRQ_EL0 only if that fails.

Since we already have measurement-based code, and reading from the device tree seemed tricky, we instead check whether CNTFRQ_EL0 seems "sane" (i.e., > 1 Hz), and if not, fall back on the measurement code used on all other Linux flavors.
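For readers who want the shape of the check without opening the diff, here is a minimal sketch of the idea. It is not the merged code: `ChooseTicksPerSecond`, `ReadCntfrqEl0`, and the `measureFallback` callback are hypothetical names used only for illustration, and the register value is treated as signed purely so that a "negative" reading like the one described in this PR is easy to reject.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper (not the actual Arch code): pick the tick rate to trust.
// rawCntfrq is the value read from CNTFRQ_EL0, reinterpreted as signed so
// that a bogus "negative" reading fails the sanity check.
inline uint64_t
ChooseTicksPerSecond(int64_t rawCntfrq,
                     const std::function<uint64_t()>& measureFallback)
{
    // The fallback must be callable, since we may need it below.
    assert(measureFallback);

    // "Sane" means strictly greater than 1 Hz, as described above.
    if (rawCntfrq > 1) {
        return static_cast<uint64_t>(rawCntfrq);
    }

    // Otherwise defer to a measurement-based estimate supplied by the caller.
    return measureFallback();
}

#if defined(__aarch64__) && defined(__linux__)
// Assumed way of reading the register on Linux aarch64 with GCC or Clang.
inline int64_t
ReadCntfrqEl0()
{
    uint64_t value;
    __asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(value));
    return static_cast<int64_t>(value);
}
#endif
```

On an aarch64 Linux build, a caller would pass `ReadCntfrqEl0()` as the first argument and whatever measurement routine the platform already uses as the second. Passing the fallback as a callback means the comparatively slow measurement only runs when the register value is rejected.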

I ran across this issue when running an aarch64 VM on aarch64 hardware; I'm not sure whether the issue was with the actual hardware or just the VMware VM. In this case, CNTFRQ_EL0 was returning a negative value. I'm not sure all "bad" values will be as obvious, but this change is at least an improvement.

Fixes Issue(s)

  • Arch timing functions and TfStopwatch on Linux aarch64 VMs (and possibly other aarch64 platforms?)
  • I have verified that all unit tests pass with the proposed changes
  • I have submitted a signed Contributor License Agreement

@jesschimein

Filed as internal issue #USD-9239

@pixar-oss pixar-oss merged commit fc81c00 into PixarAnimationStudios:dev Mar 1, 2024
5 checks passed