How can we (Arm) help? #33

diegorusso · 2023-11-20T18:52:02Z

Hello @mdboom, @gvanrossum pointed me to this repo and suggested that I raise an issue to introduce myself.

I'm Diego from Arm Ltd and I was wondering if there is anything we could do to help you out with the benchmarks story you are maintaining.
Recently I've been working with Łukasz Langa to have aarch64 benchmarks on speed.python.org (more info on this thread) and Łukasz is fixing a few issues on the website front.

For instance I see you have arm64 results but not aarch64 results. What's the reason for that? What about Windows on Arm?

This is really an initial contact to see if we could help each other, start a discussion, and help fill any gaps you might have in your infrastructure.

Thanks!

mdboom · 2023-11-20T19:20:22Z

Hi! Thanks for reaching out. I saw your work with Łukasz and that's great to see.

There's no real reason we don't have aarch64 or Windows on Arm yet other than prioritizing the Tier 1 platforms (and Tier 1.5, in the case of darwin-arm64) first. We'd obviously need dedicated, bare-metal hardware to run the GitHub self-hosted runner on.
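
The arm64/aarch64 distinction above is partly a naming matter: the same 64-bit Arm architecture is usually reported as `arm64` on macOS and `aarch64` on Linux. A minimal sketch of how a benchmarking script might tell the targets apart, using Python's standard `platform` module. The `platform_label` helper and the label strings it returns are hypothetical and are not names taken from this repo.

```python
import platform


def platform_label(system: str, machine: str) -> str:
    """Map an (OS, machine) pair to a short label (hypothetical naming)."""
    # macOS typically reports "arm64", Linux "aarch64", Windows "ARM64".
    is_arm64 = machine.lower() in {"arm64", "aarch64"}
    if not is_arm64:
        return f"{system.lower()}-{machine.lower()}"
    if system == "Darwin":
        return "darwin-arm64"
    if system == "Linux":
        return "linux-aarch64"
    if system == "Windows":
        return "windows-arm64"
    # Unknown OS on 64-bit Arm: fall back to a generic label.
    return f"{system.lower()}-arm64"


if __name__ == "__main__":
    # Label for the machine this script is running on.
    print(platform_label(platform.system(), platform.machine()))
```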

diegorusso · 2023-11-24T17:02:18Z

Michael, ok thanks for the update. Let me see what we can do to help you out.

gvanrossum · 2024-01-24T17:56:37Z

Did this discussion move elsewhere? Can this issue be closed? Or are we still waiting for @diegorusso ?

diegorusso · 2024-01-24T18:19:08Z

Hello, I'm still busy fixing things up for speed.python.org and enabling aarch64 metrics. In theory (but I want to check with Łukasz first) we could use that machine to run more benchmarks. At the moment we are running nightly benchmarks (at midnight) to mimic the behaviour of the x86 counterpart. We should utilise it more: it's a pity to keep 80 cores with 256GB of RAM idle :)
We set up the machine with CPU isolation and we could run pyperformance in parallel using CPU affinity (ATM up to 8 parallel runs of pyperformance).
I'll ping Łukasz so we can work out what the best plan is to maximise the use of this machine.
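
The parallel-run idea above can be sketched as follows: split the available cores into disjoint groups and pin each benchmark worker to one group. This is only an illustration under stated assumptions. The `partition_cpus` and `pin_current_process` helpers are hypothetical, the 80-core and 8-group figures come from the comment above, and the actual set of isolated CPUs on the machine is not given in this thread. `os.sched_setaffinity` is Linux-only.

```python
import os


def partition_cpus(cpus, groups):
    """Split CPU ids into `groups` disjoint, contiguous chunks."""
    cpus = sorted(cpus)
    if groups < 1 or groups > len(cpus):
        raise ValueError("groups must be between 1 and the number of CPUs")
    size, extra = divmod(len(cpus), groups)
    chunks, start = [], 0
    for i in range(groups):
        # The first `extra` chunks take one additional CPU each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(cpus[start:end])
        start = end
    return chunks


def pin_current_process(cpu_ids):
    """Restrict the calling process to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, set(cpu_ids))


if __name__ == "__main__":
    # Assumption: CPU ids 0-79 are all usable; a real setup would use
    # only the isolated cores configured on the machine.
    for worker, cpu_ids in enumerate(partition_cpus(range(80), 8)):
        print(f"worker {worker}: CPUs {cpu_ids[0]}-{cpu_ids[-1]}")
```

Each worker process would call `pin_current_process` with its own chunk before starting its benchmark run, so that no two runs share a core.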

mdboom · 2024-03-29T13:41:09Z

@diegorusso: We are at a point where we could definitely use native aarch64 hardware. CPython is currently developing a JIT, and while it does work on aarch64/Linux, we currently use emulation for CI, and we have no visibility into its performance. Would you be available to discuss how we could get access to that machine (or some other)? Our benchmarking infrastructure is currently based on GitHub Actions self-hosted runners, so the main lift would be getting the GHA runner software installed on it and talking to our benchmarking repo.

diegorusso · 2024-04-02T15:46:45Z

Hello @mdboom, how are you? Thanks for reaching out. Before exploring the options, I have a few questions:

How many runs do you have per day?
How long do they last? Is it the standard pyperformance run?
Who can kick off the build/run? Is it just a set of people or anyone from the community?
How are the builds kicked off? Is it cronjob-like, automatic (via PR), or manual?

Apologies for the list of questions but it will help to understand the use case and see if there is a viable solution here.

Thanks

mdboom · 2024-04-02T15:56:14Z

> How many runs do you have per day?

Probably 4-5 times a day on average.

> How long do they last? Is it the standard pyperformance run?

Yes, it's the standard pyperformance run -- usually about 20 minutes for a PGO compile and 1h15m for the benchmark runs.

> Who can kick off the build/run? Is it just a set of people or anyone from the community?

It's just a restricted set of people we trust -- security on raw metal is challenging, and it's just easier that way (and what GitHub recommends).

> How are the builds kicked off? Is it cronjob-like, automatic (via PR), or manual?

They are usually kicked off manually by a developer wanting to test a particular change, but we also run a weekly cronjob. We don't trigger runs automatically via PR due to the same security concerns.

> Apologies for the list of questions but it will help to understand the use case and see if there is a viable solution here.

No problem.

We also have the different use case of just needing occasional direct access to a Linux-on-ARM machine to debug things when code generation isn't working. We currently use emulation for this, but I think having access to real hardware would be simpler than dealing with cross compilation, etc. @brandtbucher can probably provide more details.
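
For a rough sense of the load these answers imply, the figures quoted above (about 20 minutes for the PGO compile, 1h15m for the benchmark run, 4-5 runs a day) can be combined as below. This is a back-of-the-envelope sketch that assumes runs execute one at a time. The constant and function names are illustrative only.

```python
from datetime import timedelta

# Figures quoted in the comment above.
PGO_BUILD = timedelta(minutes=20)
BENCHMARK_RUN = timedelta(hours=1, minutes=15)


def daily_machine_time(runs_per_day: int) -> timedelta:
    """Total machine time per day if runs execute one after another."""
    return runs_per_day * (PGO_BUILD + BENCHMARK_RUN)


if __name__ == "__main__":
    for runs in (4, 5):
        print(f"{runs} runs/day -> {daily_machine_time(runs)}")
```

Under these assumptions the machine would be busy for roughly 6h20m to 7h55m a day.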

brandtbucher · 2024-04-02T17:15:40Z

> We also have the different use case of just needing occasional direct access to a Linux-on-ARM machine to debug things when code generation isn't working. We currently use emulation for this, but I think having access to real hardware would be simpler than dealing with cross compilation, etc. @brandtbucher can probably provide more details.

Just to clarify: I already have AArch64 Linux hardware that I've been using to develop and debug the JIT for that platform. As far as I see it, our needs are currently:

Individual access to a WoA machine for development/debugging.
Benchmarking infrastructure for AArch64 Linux.
Benchmarking infrastructure for WoA.

Less important, but still on the wish list:

AArch64 Linux JIT CI for CPython (non-emulated)
WoA JIT CI for CPython (non-emulated)

JIT buildbots are something we may want to consider at some point to help fill CI gaps.

diegorusso · 2024-04-19T16:37:51Z

Hello,

thanks for providing more information about your use case.

I've put in a request (https://github.com/WorksOnArm/equinix-metal-arm64-cluster/issues/325) for a bare-metal AArch64 machine via WorksOnArm to help you out with AArch64 Linux.

Regarding "AArch64 Linux JIT CI for CPython": if you mean the public CPython project and generic AArch64 testing, you can follow the latest here: https://discuss.python.org/t/pep-11-proposal-to-promote-aarch64-plaftorms-to-tier-1/44774/24 TL;DR: we're working on it.

brandtbucher · 2024-04-22T22:42:27Z

Thanks! Looks like that issue got an "approved" label, which seems promising. :)

diegorusso · 2024-04-25T19:48:01Z

Correct, the request has been approved. I'm going to provision the machine by tomorrow. In the meantime can you suggest who the best person is to discuss access/admin to the machine?

brandtbucher · 2024-04-25T20:09:17Z

That would be @mdboom.

mdboom · 2024-04-25T20:37:59Z

> That would be @mdboom.

Yep. I just now got in touch via e-mail. Looking forward to chatting.

diegorusso · 2024-04-29T12:51:27Z

Quick update on this. The machine has been provisioned and it has an FQDN. On Wednesday we will have a catch-up with Mike so we can decide on access to this machine.

diegorusso · 2024-05-01T16:16:42Z

@mdboom now has full access to the AArch64 machine. It runs Ubuntu 22.04 and has 80 cores, 256GB of memory, and ~870GB of NVMe storage.
I've installed the basic dependencies to build CPython, but I'll leave it to Mike to install whatever he needs to have the machine hooked into the CI system.

brandtbucher · 2024-05-01T17:39:04Z

Thanks @diegorusso!
