Windows Die and Module (and NumaNodeEx) relationships #480

Merged
merged 1 commit into from
Aug 22, 2024

@bgoglin bgoglin commented Aug 20, 2021

Some future Windows release (TBD) will expose "Die" and "Module" information as well as NUMA nodes spanning multiple processor groups. Bits are appearing in the online doc, but we don't know all the details yet, and we don't have a way to test yet.

@bgoglin bgoglin force-pushed the windows-new-levels branch 3 times, most recently from fefc02e to 9522cbf Compare November 9, 2021 10:35
@bgoglin bgoglin force-pushed the windows-new-levels branch from a574fde to 0153e8a Compare April 17, 2023 09:28

bgoglin commented Apr 18, 2023

As of 2023/04, WS22 (Windows Server 2022) returns Die/Module info when explicitly requested (Windows 11 does the same, when it doesn't crash for unknown reasons). When requesting All relations on a machine without multiple Dies or Modules, Processor/Core/Group/Numa are returned as usual, but no Die/Module. If that's the intended behavior, we won't have to manually ignore useless Die/Module objects, which is good.
Waiting for more tests on a really big machine.


bgoglin commented Jul 5, 2023

Requested some clarification from Microsoft at MicrosoftDocs/feedback#3917

Starting with "Windows Server 2022 (21H2, build 20348)",
new relations were added for Die, Module and NumaNodeEx.

NumaNodeEx is only for requesting NUMA info in scalable format
(contrary to the old "Numa"), which we already get through "All".

"Die" and "Module" look like what Intel added in x86 CPUID
(already supported in the x86 backend) but it's missing the "Tile"
level for some reason (never used afaik).

RelationProcessorDie uses our existing DIE object.

RelationProcessorModule uses GROUP with subkind "Module" just like x86 did.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
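
To make the mapping described in this commit message concrete, here is a minimal sketch in plain C. It is not the hwloc code: `sketch_relation`, `sketch_mapping`, `sketch_map_relation` and `sketch_is_module` are hypothetical names, the enum is only a stand-in for the Windows relation values (its numeric values are not meant to match the SDK), and the strings stand in for hwloc object types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Windows relation values named above. */
enum sketch_relation {
  SKETCH_RELATION_DIE,          /* RelationProcessorDie */
  SKETCH_RELATION_MODULE,       /* RelationProcessorModule */
  SKETCH_RELATION_NUMA_NODE_EX, /* RelationNumaNodeEx */
  SKETCH_RELATION_OTHER         /* anything not covered by this sketch */
};

/* Hypothetical description of the object the backend would create. */
struct sketch_mapping {
  const char *type;    /* object type name, NULL if not handled here */
  const char *subkind; /* extra kind for Group objects, NULL otherwise */
};

/* Die reuses the existing Die object; Module becomes a Group whose
 * subkind is "Module", as the x86 backend is said to do. */
struct sketch_mapping sketch_map_relation(enum sketch_relation rel)
{
  struct sketch_mapping m = { NULL, NULL };
  switch (rel) {
  case SKETCH_RELATION_DIE:
    m.type = "Die";
    break;
  case SKETCH_RELATION_MODULE:
    m.type = "Group";
    m.subkind = "Module";
    break;
  default:
    /* NumaNodeEx is only a request format, so nothing new to create. */
    break;
  }
  /* A subkind only makes sense on an object that exists. */
  assert(m.subkind == NULL || m.type != NULL);
  return m;
}

/* Convenience check: is this mapping a "Module" group? */
int sketch_is_module(struct sketch_mapping m)
{
  return m.type != NULL && m.subkind != NULL
      && strcmp(m.type, "Group") == 0 && strcmp(m.subkind, "Module") == 0;
}
```

Calling `sketch_map_relation(SKETCH_RELATION_MODULE)` yields a Group with subkind "Module", while `SKETCH_RELATION_NUMA_NODE_EX` yields no new object in this sketch.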
@bgoglin bgoglin force-pushed the windows-new-levels branch from 6eb2d69 to 4ae5aa7 Compare August 22, 2024 09:26
@bgoglin bgoglin changed the title [WIP DNM] Windows Die, Module and NumaNodeEx Windows Die and Module (and NumaNodeEx) relationships Aug 22, 2024
@bgoglin bgoglin merged commit 47751aa into open-mpi:master Aug 22, 2024
1 check passed
@bgoglin bgoglin deleted the windows-new-levels branch August 22, 2024 13:19
bgoglin added a commit that referenced this pull request Aug 23, 2024
The Linux backend ignored Dies that are identical to packages because
the kernel may expose that even if the hardware does not support Dies.
Windows might do the same (see #480 but no reply from Microsoft yet).

Move the filtering to the core and only apply it if all Dies are
identical to their Package.

Implemented by forcing KEEP_STRUCTURE filtering only between Dies
and Packages.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
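
The "only if all Dies are identical to their Package" condition can be illustrated with a small standalone sketch. This is not the core hwloc implementation (which, per the message above, goes through KEEP_STRUCTURE filtering); `sketch_obj` and `sketch_all_dies_match_packages` are hypothetical names, and CPU sets are simplified to 64-bit masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a topology object: a bitmask of logical CPUs. */
struct sketch_obj {
  uint64_t cpuset;
};

/* Return 1 if every Die covers exactly the same CPUs as one of the
 * Packages, 0 as soon as one Die differs. Only in the first case would
 * the Die level be redundant and safe to drop as a whole. */
int sketch_all_dies_match_packages(const struct sketch_obj *dies, size_t ndies,
                                   const struct sketch_obj *packages,
                                   size_t npackages)
{
  size_t i, j;
  assert(dies != NULL || ndies == 0);
  assert(packages != NULL || npackages == 0);
  for (i = 0; i < ndies; i++) {
    int matched = 0;
    for (j = 0; j < npackages; j++) {
      if (dies[i].cpuset == packages[j].cpuset) {
        matched = 1;
        break;
      }
    }
    if (!matched)
      return 0; /* a real Die smaller than its Package: keep the level */
  }
  return 1;
}
```

With two Packages `0x0F` and `0xF0`, Dies `{0x0F, 0xF0}` give 1 (the level adds nothing), while Dies `{0x03, 0x0C, 0xF0}` give 0, so no Die is filtered even though the last one matches its Package.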
bgoglin added a commit that referenced this pull request Aug 25, 2024
(cherry picked from commit 0b0f5fa)
bgoglin added a commit that referenced this pull request Sep 6, 2024
procInfo[i] is garbage, use procInfo to get relation information
when setting group kinds.

Thanks to this fix, I now confirm that Windows 11 is able to report
"Module" objects, at least on Intel MeteorLake CPUs. hwloc exposes
them as "Module" groups, which are usually merged into L2 caches.
Ref #480

By the way, remove an obsolete comment nearby.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
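
One plausible reading of why `procInfo[i]` is garbage, assuming the buffer holds variable-size records that each carry their own size (as the Windows structure does with its `Size` field): array indexing assumes fixed-size elements, so it lands in the middle of a previous record. The sketch below is portable C with hypothetical names (`sketch_record`, `sketch_walk`, `sketch_append`), not the hwloc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; the payload follows it in the buffer. */
struct sketch_record {
  uint32_t relationship;
  uint32_t size; /* total size of this record in bytes, header included */
};

/* Walk the buffer by advancing by each record's own size and store the
 * relationship of every record in out[]. Returns the number found. */
size_t sketch_walk(const unsigned char *buf, size_t len,
                   uint32_t *out, size_t max_out)
{
  size_t offset = 0, n = 0;
  while (n < max_out && len - offset >= sizeof(struct sketch_record)) {
    struct sketch_record rec;
    memcpy(&rec, buf + offset, sizeof(rec)); /* copy avoids alignment traps */
    if (rec.size < sizeof(rec) || rec.size > len - offset)
      break; /* malformed record, stop instead of reading past the end */
    out[n++] = rec.relationship;
    offset += rec.size;
  }
  return n;
}

/* Append one record with `payload` extra bytes; returns the new offset.
 * The payload is filled with 0xAB so that misreads are easy to spot. */
size_t sketch_append(unsigned char *buf, size_t offset,
                     uint32_t relationship, uint32_t payload)
{
  struct sketch_record rec;
  rec.relationship = relationship;
  rec.size = (uint32_t)sizeof(rec) + payload;
  memcpy(buf + offset, &rec, sizeof(rec));
  memset(buf + offset + sizeof(rec), 0xAB, payload);
  return offset + rec.size;
}
```

In a buffer whose first record has an 8-byte payload, treating the buffer as an array and reading element 1 would return payload bytes (`0xABABABAB`) as the relationship, whereas `sketch_walk` reads each header at its real offset.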
bgoglin added a commit that referenced this pull request Sep 11, 2024
(cherry picked from commit 1dc4faf)