Add utilities for managing a toolchain install, and install and use LLD. #3993

chandlerc · 2024-05-27T00:31:22Z

The install directory contains the BUILD logic for creating an
installable tree of data files and executables for the toolchain, and
a library to facilitate toolchain code accessing the paths to their data
within this installation.

Then adds an installation of LLD in a synthetic LLVM installation, and
teaches the Clang runner to configure this and use it for linking
instead of the system linker.

Currently, the install paths only really manage access to the LLVM
binaries installed and used by the Clang runner for linking, but
eventually other data files like the prelude and runtime libraries will
be fleshed out as well. There are TODOs for moving more things over here
such as the prelude.

One interesting aspect of this is where to put helpers like parts of
LLVM in our install. This PR suggests nesting those files under
lib/carbon. While using a lib subdirectory isn't a perfect fit for
the FHS (Filesystem Hierarchy Standard), having a single location where
private data is collected is significantly superior to spreading them
across the system. This also matches similar patterns used by Clang
itself and several other language toolchains and standard libraries.

The install directory also provides a natural place for us to build out
packaging rules to create installable packages in various formats, but
that remains future work.
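
As a rough illustration of the `lib/carbon` nesting described above, the following sketch derives private data paths from an install prefix. It is not the code in this PR: the helper names `PrivateDataDir` and `LldPath` are hypothetical, the `llvm/bin/lld` layout below `lib/carbon` is an assumption made for the example, and it uses `std::filesystem` rather than LLVM's path utilities.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: the single directory under which the toolchain's
// private data is collected, relative to the install prefix.
inline auto PrivateDataDir(const fs::path& prefix) -> fs::path {
  return prefix / "lib" / "carbon";
}

// Hypothetical helper: where a bundled LLD binary might live. The
// `llvm/bin/lld` suffix is an assumption for illustration only.
inline auto LldPath(const fs::path& prefix) -> fs::path {
  return PrivateDataDir(prefix) / "llvm" / "bin" / "lld";
}

// Usage: every private path is derived from the one prefix.
inline auto ExampleUsage() -> void {
  assert(LldPath("/usr/local").generic_string() ==
         "/usr/local/lib/carbon/llvm/bin/lld");
}
```

Keeping all derived paths behind helpers like these means callers never hard-code the layout, so the nesting can change in one place.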

chandlerc

Thanks for all the feedback. I think I got all the comments here, plus the ones in #3989. But if I missed anything, let me know.

A few of them I think I still have questions on that need to get resolved.

jonmeow · 2024-05-29T17:32:38Z

toolchain/install/install_paths.cpp

+  CARBON_VLOG()
+      << "Failed to detect a recognized install path, falling back to: "
+      << prefix_;


I think this fallback might actually be dangerous and miss issues, particularly given the custom runfiles implementation. Can it verify that structure seems correct?

chandlerc · 2024-05-31T09:31:20Z

Ok, I've switched to verifying everything and having an error state that is exposed.

FWIW, I tried a bunch of other API designs first before landing here; none of them worked well. I wanted to use an ErrorOr return or an optional return, but both proved really challenging to make work well. Having the class itself retain error information worked much better. =/

jonmeow · 2024-05-31

Thanks, I see that. But it looks like error() is only checked in install_paths_test -- am I missing something? Note, ErrorOr does have some burden, but that's because it's forcing checks.

If you prefer to keep returning a finished InstallPaths when there's a structural error, you might consider something like CheckMarkerFile() -> ErrorOr<Success> instead of storing an error() state. Really there's not much need to store the error if it's not used, and it avoids an implication that the structure is acting on invalid state.

jonmeow · 2024-05-31T16:06:07Z

Oh, I see, you're clearing the path, which will cause cwd to be used. But still, shouldn't more places -- at least our tests -- be validating they got a correct structure back? Do we want to support unexpected directory paths?

chandlerc · 2024-05-31T18:48:29Z

So there are three cases...

We should definitely directly validate this in most tests, I'll work on adding that.

We can't (yet) validate this in fuzzers because we need to restructure the fuzzer code to support finding this. I've left TODOs about that, and wanted to get to this after moving the prelude over. But once done, that should remove the need for a fuzzer/test construction that is unvalidated I think.

The driver is a bit trickier because some invocations of the driver don't use the install. Does it make sense to unconditionally error on missing install even if we don't need it? I can see arguments in both directions and don't feel strongly here. But even if we do validate in all cases, I think we want to do that inside the driver where we have our diagnostic machinery so we can print the error using that infrastructure.

So my plan is to leave a lot of the missing validation here for this PR (for fuzzers and the driver), and then work on both getting the fuzzers to use this and getting the driver to validate it however much seems reasonable.

Does that make sense?

chandlerc · 2024-06-01

FWIW, this seems fine to fix forward, and so based on approval going to merge as-is. It at least covers checking in all the tests. And it'll let me empty the stack of PRs. Happy to follow up and fix more cases, and make any further API tweaks here though if there are better APIs, especially if we have a better idea of how to handle diagnosing issues here in the driver itself (the only place where it seems a bit tricky).
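
For readers following this thread, here is a minimal sketch of the "class retains its own error" shape being discussed. It is an assumption-laden illustration, not the PR's `InstallPaths`: the class name `InstallPathsSketch` and the marker file name `carbon_install.txt` are hypothetical, and it uses `std::filesystem` and `std::string` in place of the LLVM types. Only the ideas of a marker-file check, a stored optional error, and clearing the prefix on failure come from the discussion.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

class InstallPathsSketch {
 public:
  // Validates the prefix on construction instead of silently falling back.
  explicit InstallPathsSketch(fs::path prefix) : prefix_(std::move(prefix)) {
    CheckMarkerFile();
  }

  // Set when validation failed; callers are expected to check this.
  auto error() const -> const std::optional<std::string>& { return error_; }
  auto prefix() const -> const fs::path& { return prefix_; }

 private:
  // Records the failure and clears the prefix so no stale path is used.
  auto SetError(std::string message) -> void {
    error_ = std::move(message);
    prefix_.clear();
  }

  // Looks for a marker file (hypothetical name) under the prefix.
  auto CheckMarkerFile() -> void {
    std::error_code ec;
    fs::path marker = prefix_ / "lib" / "carbon" / "carbon_install.txt";
    if (!fs::exists(marker, ec)) {
      SetError("no install marker at: " + marker.string());
    }
  }

  fs::path prefix_;
  std::optional<std::string> error_;
};
```

A test or driver would construct the object and then inspect `error()`. The trade-off raised in review is visible here: nothing forces a caller to make that check, which a returned result type would.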

chandlerc

Due to new and exciting ways download failures can manifest, I'm struggling to get clean test runs here, but I think I've implemented what was desired in our discussion: narrow and explicit use of different detection strategies, along with some amount of validation and error handling.

The error management was awkward as we currently need to create drivers with bad installs and, moreover, don't have support in the driver for diagnosing this. So all of that is TODO -- we need diagnostics and other things to cover when we expect a non-error install to be there and instead have an erroneous one. But the tracking of errors is in place, so we should be able to add that later?

This PR is already really big and taking a lot of iterations, so I didn't want to add anything more to it. I'm worried about how much I've had to add to make all this work as-is. Anyways, hopefully this makes more sense to folks. And if not, happy to chat about what would make more sense to unblock all of this.

jonmeow

Approving, though I'd suggest at least having tests + fuzzer uses CHECK-fail or similar if there's an error. We should really expect valid trees in those.

jonmeow · 2024-05-31T16:14:57Z

toolchain/install/install_paths.h

+  auto CheckMarkerFile() -> void;
+
+  llvm::SmallString<256> prefix_;
+  std::optional<std::string> error_;


FWIW, I think you could also represent this as ErrorOr<Success>

chandlerc · 2024-05-31T18:47:51Z

Hmm, I tried this and it ended up being awkward... ErrorOr isn't copyable, and seems to somewhat want consumers to move the error out of it. Using an optional string allows just a single accessor and seems a bit simpler. Also not sure the location is really helpful from Error here.
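
The copyability point can be shown with a small sketch. `MoveOnlyResult` below is a hypothetical stand-in for a move-only result type; it is not Carbon's `ErrorOr`, whose actual definition is not shown in this conversation. The two structs are likewise invented for the comparison.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a move-only result type.
class MoveOnlyResult {
 public:
  explicit MoveOnlyResult(std::string error) : error_(std::move(error)) {}
  MoveOnlyResult(MoveOnlyResult&&) = default;
  auto operator=(MoveOnlyResult&&) -> MoveOnlyResult& = default;
  MoveOnlyResult(const MoveOnlyResult&) = delete;
  auto operator=(const MoveOnlyResult&) -> MoveOnlyResult& = delete;

 private:
  std::string error_;
};

// Storing the move-only type makes the enclosing type move-only too.
struct PathsWithResult {
  MoveOnlyResult status;
};

// Storing an optional string keeps the enclosing type copyable.
struct PathsWithOptional {
  std::optional<std::string> error;
};

static_assert(!std::is_copy_constructible_v<PathsWithResult>);
static_assert(std::is_copy_constructible_v<PathsWithOptional>);
```

Under these assumptions, a paths object that needs to be passed around by value is simpler to build on the optional string, at the cost of not forcing callers to check it.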

jonmeow · 2024-05-31T16:17:34Z

toolchain/install/install_paths.h

+  auto SetError(llvm::Twine message) -> void;
+  auto CheckMarkerFile() -> void;


Would it make sense to add comments to these, to explain what they're doing?

chandlerc

Updated based on review, PTAL though as there are still some discussion points where I think there are options here.

chandlerc · 2024-06-01T03:19:15Z

> Updated based on review, PTAL though as there are still some discussion points where I think there are options here.

Mentioned above, but should mention in the main thread for visibility -- re-looking at this and the approval from Jon, I'm gonna merge with the tests checking this. I'll work on follow-up changes to improve fuzzing, and see how we can handle this in the driver and if we can remove the error state from the design at some point.

github-actions bot added the toolchain label May 27, 2024

github-actions bot requested a review from jonmeow May 27, 2024 00:31

chandlerc mentioned this pull request May 27, 2024: Add very minimal support for packaging the toolchain. #3994 (Merged)

chandlerc force-pushed the extract-install-dir branch 2 times, most recently from ba3252f to fdab8dd May 27, 2024 04:47

jonmeow mentioned this pull request May 28, 2024: Start building and bundling LLD for linking. #3989 (Closed)

jonmeow reviewed May 28, 2024

chandlerc force-pushed the extract-install-dir branch 3 times, most recently from 13d99d0 to f7905d3 May 29, 2024 02:54

chandlerc commented May 29, 2024

chandlerc changed the title from "Introduce a package for installation management." to "Add utilities for managing a toolchain install, and install and use LLD." May 29, 2024

jonmeow reviewed May 29, 2024

toolchain/install/install_paths.h

toolchain/install/installation.cpp (outdated)

jonmeow reviewed May 29, 2024

Apply suggestions from code review

6b7fd93

Co-authored-by: Jon Ross-Perkins <jperkins@google.com>

CarbonInfraBot reviewed May 29, 2024

toolchain/install/install_paths.h (outdated)

chandlerc force-pushed the extract-install-dir branch 3 times, most recently from 4e47b90 to 6b7fd93 May 31, 2024 09:35

chandlerc commented May 31, 2024

jonmeow reviewed May 31, 2024

toolchain/install/install_paths.h (outdated)

jonmeow approved these changes May 31, 2024

jonmeow reviewed May 31, 2024

toolchain/install/install_paths.h (outdated)

jonmeow reviewed May 31, 2024

Apply suggestions from code review

c51c098

Co-authored-by: Jon Ross-Perkins <jperkins@google.com>

CarbonInfraBot reviewed May 31, 2024

toolchain/install/install_paths.h (outdated)

updates based on review

b9b9b7b

chandlerc commented May 31, 2024

chandlerc added this pull request to the merge queue Jun 1, 2024

Merged via the queue into carbon-language:trunk with commit d3a5b0e Jun 1, 2024
7 checks passed

chandlerc deleted the extract-install-dir branch June 1, 2024 03:36
