Support for ARM Neoverse N1 platform #381

amarathe84 · 2023-02-07T22:31:47Z

Description

This WIP PR adds support for the Ampere Neoverse N1 ARM SoC platform. The changes introduce an ARM CPU version check that binds the appropriate low-level functionality to the higher-level API, and implement the low-level functions in Variorum that expose the power features supported by the Neoverse N1 platform. The ARM code base will also be refactored so that it is split by specific ARM platform.

Fixes #378, #379

Task checklist

- Code contributions to support Ampere Neoverse N1 telemetry
- Integration/regression testing
  - ARM Neoverse N1 tests
  - ARM Juno r2 telemetry APIs
  - ARM Juno r2 control API
- Documentation update
- Build/CI update

Testing

Unit and component testing will be done using Variorum example programs on the following systems:

- Neoverse N1: tests on nvhpc1 in powerlab, running the examples that exercise the telemetry functions.
- Arm Juno r2: tests on Juno in powerlab, rerunning the examples previously tested for the telemetry and capping APIs.

Nvhpc1 system integration tests (Ampere Neoverse N1)

Print power usage

$ examples/variorum-print-power-example
_ARM_POWER Host CPU_mW I/O_mW
_ARM_POWER nvhpc1 9583.00 32763.00
_ARM_POWER nvhpc1 9583.00 32763.00


$ examples/variorum-print-verbose-power-example
_ARM_POWER Host: nvhpc1, CPU: 9988.00 mW, I/O: 32623.00 mW
_ARM_POWER Host: nvhpc1, CPU: 9988.00 mW, I/O: 32623.00 mW

Print CPU temperature

$ examples/variorum-print-thermals-example
_ARM_TEMPERATURE Host LOC1 SoC
_ARM_TEMPERATURE nvhpc1 38.00 29.00

$ examples/variorum-print-verbose-thermals-example
_ARM_TEMPERATURE Host: nvhpc1,LOC1: 38.00 C, SoC: 29.00 C

Print CPU frequency

$ examples/variorum-print-frequency-example
_ARM_CLOCKS Host Socket Clock_MHz
_ARM_CLOCKS nvhpc1 0 2777

$ examples/variorum-print-verbose-frequency-example
_ARM_CLOCKS Host: nvhpc1, Socket: 0, Clock: 2777 MHz
_ARM_CLOCKS Host: nvhpc1, Socket: 0, Clock: 2777 MHz

Cap CPU frequency

$ examples/variorum-cap-socket-frequency-limit-example -f 1000
Capping CPU 0 to 1000 MHz.
_ARM_CLOCKS Host Socket Clock_MHz
_ARM_CLOCKS nvhpc1 0 1000

Juno system integration and regression tests (Arm Juno r2)

Print power usage

$ examples/variorum-print-power-example
_ARM_POWER Host Sys_mW Big_mW Little_mW GPU_mW
_ARM_POWER genericarmv8 806.95 41.55 233.75 93.36
_ARM_POWER genericarmv8 772.48 43.70 228.30 93.17

$ examples/variorum-print-verbose-power-example
_ARM_POWER Host: genericarmv8, Sys: 805.21 mW, Big: 43.73 mW, Little: 236.41 mW, GPU: 95.63 mW
_ARM_POWER Host: genericarmv8, Sys: 775.39 mW, Big: 41.58 mW, Little: 192.58 mW, GPU: 93.11 mW

Print CPU temperature

$ examples/variorum-print-thermals-example
_ARM_TEMPERATURE Host Sys_C Big_C Little_C GPU_C
_ARM_TEMPERATURE genericarmv8 37.85 23.12 24.16 24.62

$ examples/variorum-print-verbose-thermals-example
_ARM_TEMPERATURE Host: genericarmv8, Sys: 37.85 C, Big: 23.30 C, Little: 24.47 C, GPU: 24.89 C

Print CPU frequency

$ examples/variorum-print-frequency-example
_ARM_CLOCKS Host CPU Socket Clock_MHz
_ARM_CLOCKS genericarmv8 Big 0 950
_ARM_CLOCKS genericarmv8 Little 1 600

$ examples/variorum-print-verbose-frequency-example
_ARM_CLOCKS Host: genericarmv8, CPU: Big, Socket: 0, Clock: 950 MHz
_ARM_CLOCKS Host: genericarmv8, CPU: Little, Socket: 1, Clock: 600 MHz
_ARM_CLOCKS Host: genericarmv8, CPU: Big, Socket: 0, Clock: 950 MHz
_ARM_CLOCKS Host: genericarmv8, CPU: Little, Socket: 1, Clock: 600 MHz

Cap CPU frequency

# examples/variorum-cap-socket-frequency-limit-example -i 1 -f 1000
Capping CPU 1 to 1000 MHz.

_ARM_CLOCKS Host CPU Socket Clock_MHz
_ARM_CLOCKS genericarmv8 Big 0 950
_ARM_CLOCKS genericarmv8 Little 1 1000

tpatki

For documentation, some updates are missing:

Please update the supported architectures in the README in the "Platform and Microarchitecture Support" section.
Please update the supported architectures sections for the APIs added in variorum.h, so doxygen picks it up in our documentation (see here: https://github.com/LLNL/variorum/blob/dev/src/variorum/variorum.h#L296)

tpatki · 2023-04-25T08:05:51Z

src/variorum/ARM/config_arm.c

-    uint64_t *model = (uint64_t *) malloc(sizeof(uint64_t));
-    *model = ARMV8;
+    unsigned long *model = (unsigned long *) malloc(sizeof(uint64_t));
+    asm volatile(


I haven't tested this change, @amarathe84; I assume you have thoroughly tested the part where it picks up the model. Is there documentation on this that we can add somewhere?

Also, I noticed that for Juno r2 you now have different models for the big and the LITTLE processors, as opposed to the single one we had before. Can you elaborate on why? Should we represent them as the same model, since we have been viewing the big.LITTLE system as a single entity?

amarathe84 · 2023-05-12

I updated the PR description with the integration/regression tests on nvhpc1 and Juno r2. Please take a look at the test outcomes to see if they look okay.

I will also post the description of the updated model ID check here shortly.

amarathe84 · 2023-05-15

In the existing ARM implementation we assumed a generic (constant) ARMV8 model. To distinguish between ARM implementations, we need to look at the 'Primary part number' field, bits [15:4] of the MIDR_EL1 (Main ID) register, which is defined per ARM CPU implementation (as opposed to per combined SoC like Juno r2). Here are the links to the MIDR_EL1 register descriptions for the three ARM CPUs we support:

Cortex A53:
https://developer.arm.com/documentation/ddi0500/j/System-Control/AArch64-register-descriptions/Main-ID-Register--EL1

Cortex A72:
https://developer.arm.com/documentation/100095/0003/System-Control/AArch64-register-descriptions/Main-ID-Register--EL1?lang=en

Neoverse N1:
https://developer.arm.com/documentation/100616/0301/register-descriptions/aarch64-system-registers/midr-el1--main-id-register--el1

Based on the model ID we set up the lower-level interfaces.

Juno r2 is a big.LITTLE implementation with both Cortex A72 and Cortex A53 in a single SoC. There are systems (e.g., revisions of Raspberry Pi) with only one of these two CPUs but with the same interfaces to lower-level functionality, so the same code should work on them, but we haven't tested on such systems.

There's also a filesystem interface for MIDR_EL1 but I couldn't confirm if that's always available on an Arm implementation, so I went with the MRS instruction to get the model ID instead.

I'll add a subsection in the ARM Overview about model identification along with the links to ARM documentation.
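As an illustration of the part-number check described above, here is a minimal sketch in C. The enum and function names are hypothetical and are not taken from the Variorum sources. The A72 and A53 values are the ones added in this PR; 0xd0c for Neoverse N1 comes from the Arm documentation linked above. The MRS helper is only compiled on AArch64 and assumes the kernel permits (or emulates) user-space reads of MIDR_EL1.

```c
#include <stdint.h>

/* Primary part numbers, MIDR_EL1 bits [15:4]. Names are hypothetical. */
enum arm_part {
    PART_CORTEX_A53  = 0xd03,
    PART_CORTEX_A72  = 0xd08,
    PART_NEOVERSE_N1 = 0xd0c  /* value from the Arm Neoverse N1 TRM */
};

/* Extract the 12-bit primary part number from a raw MIDR_EL1 value. */
static inline unsigned midr_part_number(uint64_t midr)
{
    return (unsigned)((midr >> 4) & 0xfff);
}

/* Map a raw MIDR_EL1 value to a printable CPU name. */
static inline const char *midr_part_name(uint64_t midr)
{
    switch (midr_part_number(midr)) {
        case PART_CORTEX_A53:  return "Cortex-A53";
        case PART_CORTEX_A72:  return "Cortex-A72";
        case PART_NEOVERSE_N1: return "Neoverse N1";
        default:               return "unknown";
    }
}

#if defined(__aarch64__)
/* Read MIDR_EL1 on whichever CPU the calling thread is currently running.
 * Assumption: the OS allows or emulates this read from user space. */
static inline uint64_t read_midr_el1(void)
{
    uint64_t midr;
    __asm__ volatile("mrs %0, MIDR_EL1" : "=r"(midr));
    return midr;
}
#endif
```

On an AArch64 system, `midr_part_name(read_midr_el1())` would then name the CPU the thread happened to run on, which on a big.LITTLE SoC can differ from call to call.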

tpatki · 2023-05-15

Thank you for explaining, @amarathe84! This is very helpful for understanding the details of the model/arch_id. And yes, it would be great to document this outside of this issue too, somewhere in the ARM documentation. I didn't know about the MIDR_EL1 register.

tpatki · 2023-04-25T08:07:07Z

src/variorum/config_architecture.h

+    ARM_CORTEX_A72  = 0xd08, //ARM Cortex-A72 MPCore processor
+    ARM_CORTEX_A53  = 0xd03, //ARM Cortex-A53 MPCore processor


Why are these separate now? Shouldn't we represent the big.LITTLE Juno r2 device as a single entity, as we did in the past?

amarathe84 · 2023-05-15

Update: The model check in config_arm.c accepts either 0xd08 or 0xd03 because mrs %0, MIDR_EL1 =r<model> may return either value, depending on which CPU the OS scheduler runs it on at runtime. So as long as we pick up one of these model IDs, we know that it's not Neoverse N1 and proceed with using the sysfs interface.

Specifically for Juno r2, we could check for both the A53 and A72 CPUs in the big.LITTLE SoC rather than for either one of them, but the change may be non-trivial, since detect_arm_arch() would need topological information (i.e., a list of CPUs present on the system) to run on both CPUs sequentially. Should I explore that, or does the existing check suffice?
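The stricter check proposed above (require both part numbers rather than either one) could be sketched as follows. This is only an illustration: the function name and the idea of passing in an array of per-CPU part numbers are assumptions, and how that array is collected is left open here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Part numbers as added to config_architecture.h in this PR. */
enum { PART_A53 = 0xd03, PART_A72 = 0xd08 };

/* Return true only if the per-CPU part numbers contain at least one
 * Cortex-A53 and at least one Cortex-A72, i.e. an A72/A53 big.LITTLE SoC.
 * Hypothetical helper; parts[] holds one part number per CPU. */
static inline bool has_a72_and_a53(const unsigned *parts, size_t n)
{
    bool seen_a53 = false;
    bool seen_a72 = false;

    for (size_t i = 0; i < n; i++) {
        if (parts[i] == PART_A53)
            seen_a53 = true;
        else if (parts[i] == PART_A72)
            seen_a72 = true;
    }
    return seen_a53 && seen_a72;
}
```

A system with only A53 cores, or only A72 cores, would not match this check, so it is stricter than the either-one test currently in config_arm.c.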

tpatki · 2023-05-15

Yes, I think this would address my concern about checking it as a single entity. Thanks, @amarathe84!

amarathe84 · 2023-05-16

I changed the detect_arm_arch() function to check the CPU ID of the big and LITTLE clusters using the sysfs interface for the MIDR_EL1 register. The fix works for both Neoverse N1 and Juno r2. I did notice that the file I/O has slowed down the architecture check, but that's the only way to simplify the logic. All tests worked as expected.
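A sysfs-based read like the one described above might look like the sketch below. It assumes the standard Linux arm64 path /sys/devices/system/cpu/cpuN/regs/identification/midr_el1, which holds the register as a hexadecimal string; whether detect_arm_arch() uses exactly this path is an assumption, and both function names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse text such as "0x00000000413fd0c1\n" into a 64-bit value.
 * Returns 0 on success, -1 if no hexadecimal number could be parsed. */
static inline int parse_midr_text(const char *text, uint64_t *out)
{
    char *end = NULL;

    errno = 0;
    unsigned long long value = strtoull(text, &end, 16);
    if (errno != 0 || end == text)
        return -1;
    *out = (uint64_t)value;
    return 0;
}

/* Read MIDR_EL1 for one logical CPU through sysfs.
 * Returns 0 on success, -1 if the file is missing or unreadable.
 * Assumption: the path below is the one exposed by the running kernel. */
static inline int read_cpu_midr(int cpu, uint64_t *out)
{
    char path[128];
    char line[64];

    if (cpu < 0)
        return -1;
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/regs/identification/midr_el1",
             cpu);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char *got = fgets(line, sizeof(line), fp);
    fclose(fp);
    if (got == NULL)
        return -1;
    return parse_midr_text(line, out);
}
```

Reading the file for every CPU gives the part number of each core without depending on where the scheduler places the calling thread, at the cost of the extra file I/O mentioned above.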

tpatki · 2023-05-10T22:58:02Z

@amarathe84 Can you look at the comments here, and then we can merge?

amarathe84 · 2023-05-12T23:42:30Z

> For documentation, some updates are missing:
>
> Please update the supported architectures in the README in the "Platform and Microarchitecture Support" section.
>
> Please update the supported architectures sections for the APIs added in variorum.h, so doxygen picks it up in our documentation (see here: https://github.com/LLNL/variorum/blob/dev/src/variorum/variorum.h#L296)

Updated both README.md and variorum.h to indicate Neoverse N1 supported APIs.

tpatki · 2023-05-13T01:05:43Z

Looks good to me @amarathe84! I am curious about the model description check, but the PR is good to go I think.
@slabasan can merge after she's had time to review.

Identify and arbitrate based on the ARM CPU version

fbd8529

slabasan force-pushed the arm-neoverse branch from fbd8529 to 4b59510 February 8, 2023 05:06

Further code refactoring

fc32287

amarathe84 force-pushed the arm-neoverse branch from 4b59510 to fc32287 February 13, 2023 16:17

amarathe84 added 6 commits February 13, 2023 08:37

Power and thermal telemetry support

e8e46a2

Refactor common code out of platform-specific implementation

2a74924

Support CPU clocks (averaged)

774589a

Arbitrate low-level functions based on ARM CPU ID

96b3d55

Updated cap frequency limit functionality

6f9d15b

Updated functionality

ba2f7bf

amarathe84 changed the title ~~WIP: Support for ARM Neoverse N1 platform~~ Support for ARM Neoverse N1 platform Mar 28, 2023

amarathe84 requested a review from slabasan March 28, 2023 18:33

amarathe84 marked this pull request as ready for review March 28, 2023 18:33

tpatki added the status-ready-for-review Formatted, and tested on multiple systems. label Mar 28, 2023

amarathe84 requested a review from tpatki March 28, 2023 23:43

tpatki reviewed Apr 25, 2023

tpatki linked an issue May 11, 2023 that may be closed by this pull request

Variorum does not support the ARM Neoverse N1 SoC platform #379

Closed

Updated docs with information on Neoverse N1

3116c9e

amarathe84 force-pushed the arm-neoverse branch from db292ad to 3116c9e May 12, 2023 23:41

Updated CPU model identification information

a2d5b8f

amarathe84 force-pushed the arm-neoverse branch from bbce19c to 7f038d6 May 16, 2023 23:20

ARM CPU ID check to include two CPU IDs (for Juno r2)

139264b

amarathe84 force-pushed the arm-neoverse branch from 7f038d6 to 139264b May 16, 2023 23:22

tmcgilchrist mentioned this pull request Jun 1, 2023

Support ARM Neoverse N1 platform patricoferris/ocaml-variorum#5

Open

slabasan approved these changes Jun 5, 2023

slabasan added type-feature area-hardware-support labels Jun 5, 2023

slabasan approved these changes Jun 5, 2023

slabasan merged commit 349746f into LLNL:dev Jun 5, 2023

slabasan added this to the Production: v0.7.0 Release milestone Jun 5, 2023
