Talk idea: Distributing PGO'ed toolchains for Great Good #4

Open
gburgessiv opened this issue Jun 3, 2021 · 7 comments

Comments

@gburgessiv
Member

I work with a few teams at Google who ship LLVM-based toolchains to both Google and open-source developers. All of these toolchains have custom pipelines designed to generate PGO profiles, which, when applied to the toolchain, make compiling and linking roughly 10-20% faster.

I'd be more than happy to give a lightning talk going over:

  • The benefits the teams I work with see from PGO
  • How these teams have implemented their pipelines
  • Tools that exist in upstream LLVM that make generating & applying PGO profiles straightforward

I think this information would be useful and interesting to distributors who are interested in shipping PGO'ed toolchains, but who currently choose not to do so. Bonus points if it starts conversations of the form "we'd do PGO if only we had $X feature," where $X isn't terribly difficult to implement and land upstream. ;)

@androm3da

> custom pipelines designed to generate PGO profiles, which, when applied to the toolchain, make compiling and linking roughly 10-20% faster.

That's impressive! Is there a release workflow that could integrate PGO into the three-phase test-release.sh?

@gburgessiv
Member Author

I'm not immediately familiar with the inner workings of test-release.sh, though the two forms of "batteries-included" PGO I'm familiar with upstream are:

From a combination of what you said and the documentation, it sounds like it might not be all that difficult to optionally slide a 4th stage into test-release.sh (making the third into effectively "generate a PGO profile for me," either reusing the bits mentioned above, or shaped in a similar way to those)?

@ojeda
Member

ojeda commented Jun 4, 2021

Are the Ubuntu LLVM packages (used e.g. in GitHub CI) and/or the apt.llvm.org ones already PGO'd?

If not, it would be amazing to convince/help them to do it :)

@carlocab
Contributor

carlocab commented Jun 4, 2021

Homebrew maintainer here. I've been interested in doing something like this for Homebrew's build of LLVM. Our build has a reasonable number of users, and I think it'll help with this issue, for example.

We probably want to do the CMake-based approach. Would this work when used with the install and install-xcode-toolchain targets (perhaps after building the stage2 target)? Or would we need to use the stage2-install-distribution target?

If it's the latter, that might get tricky, since we don't make use of LLVM_DISTRIBUTION_COMPONENTS, and I can't find any documentation for the right value to set this to in order to preserve our existing configuration.

@nickdesaulniers
Member

nickdesaulniers commented Sep 9, 2021

Thanks for taking the time to write up a CFP; we'd be overjoyed to have you present at LLVM Distributors Conf 2021! If you still plan on presenting, this is a reminder to get started on your slides for next week. Once they're done, we will contact you about submitting a PDF of your slides as either a pull request to this repository or via email to the organizer. We hope to have a schedule finalized by EOW; we may iterate on the schedule based on whether presenters have conflicts. Please keep this issue open for attendees to ask questions, or close this issue if you no longer plan on attending. Reminder to keep your talk concise (15 minutes); we won't be setting aside time for questions, in order to fit in as much content as possible. Attendees should ask questions here in this GitHub issue.

@llvm-beanz

@carlocab There is no reason why you shouldn't be able to use the CMake-based multi-stage approach with distribution installation or the Xcode toolchain installation. I'm not sure I've ever used them all at the same time, but I wrote the initial implementation of both features, and they should be agnostic to each other.

The one thing to keep in mind is that doing the CMake PGO build is effectively a 3-stage build:

  • Stage 1: Host compiler to build new LLVM compiler
  • Stage 2: Stage 1 LLVM compiler to build instrumented LLVM compiler
  • Stage 3: Stage 1 LLVM compiler to build optimized LLVM compiler

As with a normal bootstrap we want a stage1 compiler so that the later stages benefit from all the bug fixes and improvements of the latest compiler. We need to have that stage1 compiler before building the instrumented compiler and generating the profile data.
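For anyone scripting this by hand, here is a rough sketch of how the configure step for each of the three stages might look. It only builds the command lines and does not run anything. The helper names (`configure_command`, `three_stage_configs`), the directory layout under `work_dir`, the Ninja generator, and the choice of `IR` instrumentation are all assumptions made for illustration; `LLVM_BUILD_INSTRUMENTED` and `LLVM_PROFDATA_FILE` are the upstream CMake options that drive the instrumented and optimized builds.

```python
from pathlib import PurePosixPath


def configure_command(llvm_src, build_dir, cc, cxx, extra_defines=()):
    """Return a CMake configure command for one stage (not executed here)."""
    return [
        "cmake", "-G", "Ninja",
        "-S", str(PurePosixPath(llvm_src) / "llvm"),
        "-B", str(build_dir),
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLVM_ENABLE_PROJECTS=clang",
        f"-DCMAKE_C_COMPILER={cc}",
        f"-DCMAKE_CXX_COMPILER={cxx}",
        *extra_defines,
    ]


def three_stage_configs(llvm_src, work_dir, host_cc="cc", host_cxx="c++"):
    """Map each stage name to its configure command.

    Hypothetical layout: <work_dir>/stage1, /stage2, /stage3, with the merged
    profile expected at <work_dir>/clang.profdata before stage 3 is configured.
    """
    work = PurePosixPath(work_dir)
    stage1_cc = work / "stage1" / "bin" / "clang"
    stage1_cxx = work / "stage1" / "bin" / "clang++"
    profdata = work / "clang.profdata"
    return {
        # Stage 1: host compiler builds a fresh clang.
        "stage1": configure_command(llvm_src, work / "stage1", host_cc, host_cxx),
        # Stage 2: stage 1 clang builds an instrumented clang.
        "stage2-instrumented": configure_command(
            llvm_src, work / "stage2", stage1_cc, stage1_cxx,
            ["-DLLVM_BUILD_INSTRUMENTED=IR"],
        ),
        # Stage 3: stage 1 clang builds the final clang using the merged profile.
        "stage3-optimized": configure_command(
            llvm_src, work / "stage3", stage1_cc, stage1_cxx,
            [f"-DLLVM_PROFDATA_FILE={profdata}"],
        ),
    }
```

Each command would be followed by a build of that directory (for example `cmake --build <dir>`), and the profile has to be collected and merged between stages 2 and 3.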

One of the things that I never really got around to was building out a better suite of training data for the PGO profiles. The in-tree training data is literally one "Hello, World" C++ program.

@carlocab
Contributor

Thanks, @llvm-beanz. I was able to implement a PGO build in the end, but I ended up just manually implementing it in our build script, pretty much as you describe, with an extra step of generating profile data with the instrumented compiler.
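For readers wondering what that extra step can look like, here is a minimal sketch, not Homebrew's actual build script. It builds the command lines for compiling a training corpus with the instrumented compiler and then merging the resulting raw profiles with `llvm-profdata merge`. The helper names, the `-O2 -c` training flags, and all paths are hypothetical; the raw profile files are passed in explicitly because where the instrumented compiler writes them depends on how it was configured.

```python
from pathlib import PurePosixPath


def training_commands(instrumented_clangxx, sources, scratch_dir):
    """One compile command per training source, run with the instrumented compiler.

    Running these exercises the instrumented compiler so that it emits raw
    profiles; the object files themselves are throwaway output.
    """
    commands = []
    for src in sources:
        obj = PurePosixPath(scratch_dir) / (PurePosixPath(src).stem + ".o")
        commands.append(
            [str(instrumented_clangxx), "-O2", "-c", str(src), "-o", str(obj)]
        )
    return commands


def merge_command(llvm_profdata, raw_profiles, output):
    """Command that merges raw profiles into a single indexed profile."""
    if not raw_profiles:
        raise ValueError("no raw profiles to merge; did the training step run?")
    return [
        str(llvm_profdata), "merge", f"-output={output}",
        *(str(p) for p in raw_profiles),
    ]
```

The merged file is what gets handed to the final, optimized build. A larger and more representative set of training sources should give a more useful profile than a single small program.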
