Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utilities for managing a toolchain install, and install and use LLD. #3993

Merged
merged 4 commits into from
Jun 1, 2024

Conversation

chandlerc
Copy link
Contributor

@chandlerc chandlerc commented May 27, 2024

The install directory contains the BUILD logic for creating an
installable tree of data files and executables for the toolchain, and
a library to facilitate toolchain code accessing the paths to their data
within this installation.

Then adds an installation of LLD in a synthetic LLVM installation, and
teaches the Clang runner to configure this and use it for linking
instead of the system linker.

Currently, the install paths only really manage access to the LLVM
binaries installed and used by the Clang runner for linking, but
eventually other data files like the prelude and runtime libraries will
be fleshed out as well. There are TODOs for moving more things over here
such as the prelude.

One interesting aspect of this is where to put helpers like parts of
LLVM in our install. This PR suggests nesting those files under
lib/carbon. While using a lib subdirectory isn't a perfect fit for
the FHS (Filesystem Hierarchy Standard), having a single location where
private data is collected is significantly superior to spreading them
across the system. This also matches similar patterns used by Clang
itself and several other language toolchains and standard libraries.

The install directory also provides a natural place for us to build out
packaging rules to create installable packages in various formats, but
that remains future work.

toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.cpp Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation_test.cpp Outdated Show resolved Hide resolved
toolchain/install/installation_test.cpp Outdated Show resolved Hide resolved
toolchain/install/test_binary.cpp Outdated Show resolved Hide resolved
The install directory contains the BUILD logic for creating an
installable tree of data files and executables for the toolchain, and
a library to facilitate toolchain code accessing the paths to their data
within this installation.

Then adds an installation of LLD in a synthetic LLVM installation, and
teaches the Clang runner to configure this and use it for linking
instead of the system linker.

Currently, the install paths only really manage access to the LLVM
binaries installed and used by the Clang runner for linking, but
eventually other data files like the prelude and runtime libraries will
be fleshed out as well. There are TODOs for moving more things over here
such as the prelude.

One interesting aspect of this is where to put helpers like parts of
LLVM in our install. This PR suggests nesting those files under
`lib/carbon`. While using a `lib` subdirectory isn't a perfect fit for
the FHS (Filesystem Hierarchy Standard), having a single location where
private data is collected is significantly superior to spreading them
across the system. This also matches similar patterns used by Clang
itself and several other language toolchains and standard libraries.

The install directory also provides a natural place for us to build out
packaging rules to create installable packages in various formats, but
that remains future work.
@chandlerc chandlerc force-pushed the extract-install-dir branch 3 times, most recently from 13d99d0 to f7905d3 Compare May 29, 2024 02:54
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the feedback. I think I got all the comments here, plus a the ones in #3989. But if I missed anything, let me know.

A few of them I think I still have questions on that need to get resolved.

toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
toolchain/install/test_binary.cpp Outdated Show resolved Hide resolved
toolchain/install/installation_test.cpp Outdated Show resolved Hide resolved
toolchain/install/installation.cpp Outdated Show resolved Hide resolved
toolchain/install/installation.cpp Outdated Show resolved Hide resolved
toolchain/install/installation.h Outdated Show resolved Hide resolved
@chandlerc chandlerc changed the title Introduce a package for installation management. Add utilities for managing a toolchain install, and install and use LLD. May 29, 2024
toolchain/install/install_paths.h Show resolved Hide resolved
toolchain/install/installation.cpp Outdated Show resolved Hide resolved
toolchain/install/BUILD Outdated Show resolved Hide resolved
toolchain/install/BUILD Outdated Show resolved Hide resolved
toolchain/install/BUILD Outdated Show resolved Hide resolved
toolchain/install/install_paths.h Outdated Show resolved Hide resolved
Comment on lines 88 to 90
CARBON_VLOG()
<< "Failed to detect a recognized install path, falling back to: "
<< prefix_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fallback might actually be dangerous and miss issues, particularly given the custom runfiles implementation. Can it verify that structure seems correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've switched to verifying everything and having an error state that is exposed.

FWIW, I tried a bunch of other API designs first before landing here, none of them worked well. I wanted to use an ErrorOr return or an optional return, but both proved really challenging to make work well. Having the class itself retain error information worked much better. =/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see that. But it looks like error() is only checked in install_paths_test -- am I missing something? Note, ErrorOr does have some burden, but that's because it's forcing checks.

If you prefer to keep returning a finished InstallPaths when there's a structural error, you might consider something like CheckMarkerFile() -> ErrorOr<Success> instead of storing an error() state. Really there's not much need to store the error if it's not used, and it avoids an implication that the structure is acting on invalid state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, you're clearing the path, which will cause cwd to be used. But still, shouldn't more places -- at least our tests -- be validating they got a correct structure back? Do we want to support unexpected directly paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are three cases...

We should definitely directly validate this in mosts tests, I'll work on adding that.

We can't (yet) validate this in fuzzers because we need to restructure the fuzzer code to support finding this. I've left TODOs about that, and wanted to get to this after moving the prelude over. But once done, that should remove the need for a fuzzer/test construction that is unvalidated I think.

The driver is a bit trickier because some invocations of the driver don't use the install. Does it make sense to unconditionally error on missing install even if we don't need it? I can see arguments in both directions and don't feel strongly here. But even if we do validate in all cases, I think we want to do that inside the driver where we have our diagnostic machinery so we can print the error using that infrastructure.

So my plan is to leave a lot of the missing validation here for this PR (for fuzzers and the driver), and then work on both getting the fuzzers to use this and getting the driver to validate it however much seems reasonable.

Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this seems fine to fix forward, and so based on approval going to merge as-is. It at least covers checking in all the tests. And it'll let me empty the stack of PRs. Happy to follow up and fix more cases, and make any further API tweaks here though if there are better APIs, especially if we have a better idea of how to handle diagnosing issues here in the driver itself (the only place where it seems a bit tricky).

Co-authored-by: Jon Ross-Perkins <jperkins@google.com>
@chandlerc chandlerc force-pushed the extract-install-dir branch 3 times, most recently from 4e47b90 to 6b7fd93 Compare May 31, 2024 09:35
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to new and exciting ways download failures can manifest, struggling to get clean test runs here, but I think I've implemented what was desired in our discussion. Narrow and explicit use of different detection strategies, along with some amount of validation and error handling. The error management was awkward as we need to create drivers with bad installs currently. And moreover, don't have support in the driver for diagnosing this. So all of that is TODO -- we need diagnostics and other things to cover when we expect a non-errors install to be there and instead have an erroneous one. But the tracking of errors is in place so we should be able t add that later? This PR is already really big and taking a lot of iterations so I didn't want to add anything more to it. Worried about how much I've had to add to make all this work as-is. Anyways, hopefully this makes more sense to folks. And if not, happy to chat about what would make more sense to unblock all of this.

toolchain/install/install_paths.h Show resolved Hide resolved
toolchain/install/installation.cpp Outdated Show resolved Hide resolved
toolchain/install/installation_test.cpp Outdated Show resolved Hide resolved
toolchain/install/BUILD Outdated Show resolved Hide resolved
toolchain/install/BUILD Outdated Show resolved Hide resolved
Comment on lines 88 to 90
CARBON_VLOG()
<< "Failed to detect a recognized install path, falling back to: "
<< prefix_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've switched to verifying everything and having an error state that is exposed.

FWIW, I tried a bunch of other API designs first before landing here, none of them worked well. I wanted to use an ErrorOr return or an optional return, but both proved really challenging to make work well. Having the class itself retain error information worked much better. =/

toolchain/install/install_paths.h Outdated Show resolved Hide resolved
Comment on lines 88 to 90
CARBON_VLOG()
<< "Failed to detect a recognized install path, falling back to: "
<< prefix_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, you're clearing the path, which will cause cwd to be used. But still, shouldn't more places -- at least our tests -- be validating they got a correct structure back? Do we want to support unexpected directly paths?

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, though I'd suggest at least having tests + fuzzer uses CHECK-fail or similar if there's an error. We should really expect valid trees in those.

auto CheckMarkerFile() -> void;

llvm::SmallString<256> prefix_;
std::optional<std::string> error_;
Copy link
Contributor

@jonmeow jonmeow May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think you could also represent this as ErrorOr<Success>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I tied this and it ended up being awkward... ErrorOr isn't copyable, and seems to somewhat want consumers to move the error out of it. Using an optional string allows just a single accessor and seems a bit simpler. Also not sure the location is really helpful from Error here.

Comment on lines 103 to 104
auto SetError(llvm::Twine message) -> void;
auto CheckMarkerFile() -> void;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add comments to these, to explain what they're doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Co-authored-by: Jon Ross-Perkins <jperkins@google.com>
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated based on review, PTAL though as there are still some discussion points where I think there are options here.

toolchain/install/install_paths.h Outdated Show resolved Hide resolved
Comment on lines 103 to 104
auto SetError(llvm::Twine message) -> void;
auto CheckMarkerFile() -> void;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

auto CheckMarkerFile() -> void;

llvm::SmallString<256> prefix_;
std::optional<std::string> error_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I tied this and it ended up being awkward... ErrorOr isn't copyable, and seems to somewhat want consumers to move the error out of it. Using an optional string allows just a single accessor and seems a bit simpler. Also not sure the location is really helpful from Error here.

Comment on lines 88 to 90
CARBON_VLOG()
<< "Failed to detect a recognized install path, falling back to: "
<< prefix_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are three cases...

We should definitely directly validate this in mosts tests, I'll work on adding that.

We can't (yet) validate this in fuzzers because we need to restructure the fuzzer code to support finding this. I've left TODOs about that, and wanted to get to this after moving the prelude over. But once done, that should remove the need for a fuzzer/test construction that is unvalidated I think.

The driver is a bit trickier because some invocations of the driver don't use the install. Does it make sense to unconditionally error on missing install even if we don't need it? I can see arguments in both directions and don't feel strongly here. But even if we do validate in all cases, I think we want to do that inside the driver where we have our diagnostic machinery so we can print the error using that infrastructure.

So my plan is to leave a lot of the missing validation here for this PR (for fuzzers and the driver), and then work on both getting the fuzzers to use this and getting the driver to validate it however much seems reasonable.

Does that make sense?

@chandlerc
Copy link
Contributor Author

Updated based on review, PTAL though as there are still some discussion points where I think there are options here.

Mentioned above, but should mention in the main thread for visibility -- re-looking at this and the approval from Jon, I'm gonna merge with the tests checking this. I'll work on follow-up changes to improve fuzzing, and see how we can handle this in the driver and if we can remove the error state from the design at some point.

@chandlerc chandlerc added this pull request to the merge queue Jun 1, 2024
Merged via the queue into carbon-language:trunk with commit d3a5b0e Jun 1, 2024
7 checks passed
@chandlerc chandlerc deleted the extract-install-dir branch June 1, 2024 03:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants