Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Overhaul Cargo's build command support #403

Merged
merged 6 commits into from
Oct 31, 2014

Conversation

alexcrichton
Copy link
Member

This RFC is targeted at overhauling the build command in Cargo manifests to be
more robust and fully functional for building and linking native code. While
largely targeted at Cargo, this RFC includes a few minor changes to the compiler
itself, hence posting this RFC in this repository.

The high level summary of this RFC is:

  1. Instead of having the build command be some form of script, it will be a
    rust command instead
  2. Establish a namespace of foo-sys packages which represent the native
    library foo. These packages will have Cargo-based dependencies between
    *-sys packages to express dependencies among C packages themselves.
  3. Establish a set of standard environment variables for build commands which
    will instruct how foo-sys packages should be built in terms of dynamic or
    static linkage, as well as providing the ability to override where a package
    comes from via environment variables.

Rendered

This RFC is targeted at overhauling the `build` command in Cargo manifests to be
more robust and fully functional for building and linking native code. While
largely targeted at Cargo, this RFC includes a few minor changes to the compiler
itself, hence posting this RFC in this repository.

The high level summary of this RFC is:

1. Instead of having the `build` command be some form of script, it will be a
   rust command instead
2. Establish a namespace of `foo-sys` packages which represent the native
   library `foo`. These packages will have Cargo-based dependencies between
   `*-sys` packages to express dependencies among C packages themselves.
3. Establish a set of standard environment variables for build commands which
   will instruct how `foo-sys` packages should be built in terms of dynamic or
   static linkage, as well as providing the ability to override where a package
   comes from via environment variables.
@steveklabnik
Copy link
Member

there's a number of rusts here, rather than Rusts.

@steveklabnik
Copy link
Member

and cargo and unix and some other things. Dunno how much that actually matters.


The responsibility of the build script is to ensure that all requested native
libraries are available for the crate to compile. The conceptual output of the
build script will be metadata metadata on stdout explaining how the compilation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"metadata metadata'

@bvssvni
Copy link

bvssvni commented Oct 16, 2014

+1 Using Rust for building makes life easier when you need something very complex. The ability to wrap native dependencies as Cargo packages makes it possible to solve these problems isolated without having to redistribute the solution across the ecosystem manually.

To overcome the problem with negation, perhaps some form of wildcard target triple can be used, and explicitly negate with an explicit dependency that gives an error message if it does not match the positive one.

extra support code.

This RFC does not propose a convention of what to name the build script files.
Like `links`, it will be illegal to specify `build` without specifying `links`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that one would just use links = [] when only using build for code-generation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah actually now that I've thought about code generation I think I'll lift this requirement. I'll change the conditions for being run to "at least one of links was not overridden or links was empty to start out with", and there would just be no arguments to the script in this case.

@Ericson2314
Copy link
Contributor

Down the road, I'd prefer to use the build scripts less for their side effects, and more to marshal data directly--say parse files, splice the parsed AST with a generated one, and then invoke the compiler. The build script would be library dynamically loaded by cargo, and (dynamically) linked to cargo and the compiler. This approach seems a bit more elegant, and, if nixos is any guide, makes caching and implicit data parallelism for distributed builds more easier.

Of course, without much rust->rust dynamic linking support this isn't feasible at the movement, and for integration with external tools, side effects will still be necessary. So I say definitely this is an improvement and +1 for now, but I hope the ideas mentioned above will be considered in some form for the long term.

@ghost
Copy link

ghost commented Oct 21, 2014

Can we separate the fetch/configure/build steps please? If Rust is going to be a systems language, Cargo won't always be the top of the build chain.

compile.rs sounds like we're going to (eventually) reimplement autotools and make as Rust libraries. Is this planned as a sub-project of Cargo, an external project, or are we expected to just shell out from Rust anyway?

@shadowmint
Copy link

compile.rs will only ever be to invoke third party tools like automake /
cmake / etc.

I suppose you could write a c compiler (eg clang) invoker in it, but it
would be a pretty major waste of time and effort.

Can you elaborate on the fetch / build / configure thing?

Why would you ever want the c compile step to be broken into distinct
parts? If you want to manage your c build step, fork the sys-foo package
that you depend on and add support for using [proprietary build server
here] in it, and override the sys-foo package at the top level.
On Oct 22, 2014 6:56 AM, "György Andrasek" notifications@github.com wrote:

Can we separate the fetch/configure/build steps please? If Rust is going
to be a systems language, Cargo won't always be the top of the build chain.

compile.rs sounds like we're going to (eventually) reimplement autotools
and make as Rust libraries. Is this planned as a sub-project of Cargo, an
external project, or are we expected to just shell out from Rust anyway?


Reply to this email directly or view it on GitHub
#403 (comment).

@alexcrichton
Copy link
Member Author

@Jurily I agree with @shadowmint that the separation of the build steps seems orthogonal to the build cmd aspect of Cargo.

I also do not plan to create a sub-project of cargo to reinvent autotools. I expect that, if necessary, those sorts of libraries will grow naturally in the ecosystem. I'll have to write a few shims for various packages to get Cargo itself working, but I don't plan on undertaking a huge endeavor right out of the gates.

@tomaka
Copy link

tomaka commented Oct 23, 2014

Should the output of the build script be cached? (the metadata and rustc_flag)

In some situations it is desirable to cache it (for example if you generate Rust code, there's no reason why anything would change), but sometimes not (if you are pointing to a library in /usr/lib, you should check whether it is available every time you compile).

Eridius on #cargo suggested to add a flag to tell cargo whether or not it should be cached. In which case I think that this flag should be added to the output of the script. I like this idea.
We could also take the simpler route of not caching anything, or letting the script handle the cache itself.

@shadowmint
Copy link

@tomaka how would control that at a fine level?

If you had something like:

 cargo build --cache_build_step

Then cargo will assume it has to cache all the build steps, which categorically wrong for building c libraries; in general the build tool (make, whatever) will already have the functionality built in to detect file system level changes and invoke the compiler if required.

For a sys-foo package this won't be an issue because once the crate is compiled it won't be recompiled on each build anyway.

For a normal package with code gen it won't be an issue because once the crate is compiled as a dependency it won't be compiled again.

So... it's really only useful if the top level application is using a build step to generate code, and you don't want to run that build step on every compile.

Any --cache_build_step flag shouldn't apply to dependencies, and only have an effect on the top level location where 'cargo build' is invoked as far as I can tell.

(Personally I'm not sure how useful this would actually be; I'm skeptical many top-level applications would use a build step, but I suppose it might make things slightly easier in some specific cases?)

@tomaka
Copy link

tomaka commented Oct 23, 2014

I was not thinking of a flag for the cargo command, but a flag in the output of the script like this:

cargo:rustc-flags=-l static:foo -L /path/to/foo
cargo:root=/path/to/foo
cargo:libdir=/path/to/foo/lib
cargo:include=/path/to/foo/include
cargo:cache=true

As with other small improvements, I guess we'll see later if there's a need for this.

@shadowmint
Copy link

Oh fair enough. I still don't see how useful that will be, given dependencies only get build once anyway, but if there's a need for it that seems fair enough. How would cargo cache the data between builds?

@tomaka
Copy link

tomaka commented Oct 23, 2014

given dependencies only get build once anyway

That's the problem here.

Let's say that the build script of curl-sys calls pkg-config in order to find where on the system the library is located, and if it doesn't find any result, then it builds it manually.

If you run cargo build without having libcurl installed, it works because the script automatically builds the library. But if you run apt-get install libcurl, cargo build, apt-get remove libcurl, cargo build you get linking errors if the script is not re-run.

That's why in my opinion it is desirable to run all the build scripts every time you run cargo build, even if they are deeply nested within some dependencies.
After all the build script is free to do whatever it wants, and we have no idea whether or not its dependencies have been modified (in this example, what I mean by "its dependencies" is the fact that libcurl is installed).

The consequence of this, however, is that in the Case study: generated code, we need to make sure that the code is recompiled if $(OUT_DIR)/generated.rs is modified. I don't know if the include! macro does this already, but I know that it's technically possible.

@shadowmint
Copy link

mm... I don't have any strong feeling either way.

I'm not exactly sure what use case is being served by making cargo rebuild every dependency every build on the off chance some system settings might have changed. CI on travis / cdash maybe? I guess that's plausible.

I'd probably err on the side of waiting until it was actually an issue that is troubling people rather than trying to future proof against it, but that's just my $0.02.

@o11c
Copy link
Contributor

o11c commented Oct 23, 2014

@tomaka no, that's why cargo needs to ask pkg-config itself, and not support non-pkg-config ways.

@shadowmint
Copy link

@o11c Why would cargo ever talk to pkg-config?

On Thu, Oct 23, 2014 at 4:52 PM, o11c notifications@github.com wrote:

@tomaka https://github.com/tomaka no, that's why cargo needs to ask
pkg-config itself, and not support non-pkg-config ways.


Reply to this email directly or view it on GitHub
#403 (comment).

@alexcrichton
Copy link
Member Author

@tomaka I do think we'll need to cache it to forward metadata down to dependent packages. We should probably follow today's rebuilding heuristics in terms of when we run the build command which is to say that any package with a build script will be rebuilt if any file in the package changes (we don't know what files are inputs to the build script).

Which is to say I don't think the build scripts should unconditionally run on all cargo build invocations (we should avoid as much work as possible). I do think that we'll run into the problem that you outlined with installing and uninstall libcurl, but there's some situations I'm willing to accept are out of our control (like that), and it's easy enough to run cargo clean and start over!

@lilyball
Copy link
Contributor

Along with -l, can rustc also gain a flag to specify search paths for native libraries? This would help with rust-lang/rust#13733 on OS X (the issue there is I can't specify -L /usr/local/lib because it breaks rustc's search for rust libraries).

to the same C library, the C dependency should be refactored into a common
Cargo-packaged dependency.

It is illegal to define `link` without also defining `build`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/link/links/

@lilyball
Copy link
Contributor

Doing per-target-triple dependencies sounds maybe too precise. I could imagine there being per-target-triple dependencies, but I would think that more common would be per-OS dependencies.

path = "winhttp"
```

Here the top-level configuration key `platform` will be a table whose sub-keys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says "platform" but the example uses "target".

@alexcrichton
Copy link
Member Author

A search path for native libraries seems somewhat orthogonal to the rustc changes mentioned here, perhaps that's best left for another time. Issues like rust-lang/rust#16402 may also provide a solution there as well.

I agree that the per-triple dependencies may be a little too precise, but I do also feel that we'll always need such a level of precision, we may just want to generalize it later to something more compatible. For example we may want to support something like *-apple-darwin, as a key in the manifest. In general I'm trying to keep as much "triple logic" out of cargo as possible to ensure we don't parse anything differently than rustc, but I do think that it may be inevitable. Do you have an idea in mind to make the syntax a little more palatable though?

@brson
Copy link
Contributor

brson commented Oct 31, 2014

Merged. Discussion. Tracking.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Oct 31, 2014
This is an implementation of the rustc bits of [RFC 403][rfc]. This adds a new
flag to the compiler, `-l`, as well as tweaking the `include!` macro (and
related source-centric macros).

The compiler's new `-l` flag is used to link libraries in from the command line.
This flag stacks with `#[link]` directives already found in the program. The
purpose of this flag, also stated in the RFC, is to ease linking against native
libraries which have wildly different requirements across platforms and even
within distributions of one platform. This flag accepts a string of the form
`NAME[:KIND]` where `KIND` is optional or one of dylib, static, or framework.
This is roughly equivalent to if the equivalent `#[link]` directive were just
written in the program.

The `include!` macro has been modified to recursively expand macros to allow
usage of `concat!` as an argument, for example. The use case spelled out in RFC
403 was for `env!` to be used as well to include compile-time generated files.
The macro also received a bit of tweaking to allow it to expand to either an
expression or a series of items, depending on what context it's used in.

[rfc]: rust-lang/rfcs#403
lambda-fairy added a commit to lambda-fairy/rfcs that referenced this pull request Mar 5, 2015
lambda-fairy added a commit to lambda-fairy/rfcs that referenced this pull request Mar 13, 2015
@Centril Centril added A-cargo-command Proposals relating to cargo commands. A-env Environment variable related proposals & ideas labels Nov 23, 2018
@maanu1234

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cargo-command Proposals relating to cargo commands. A-env Environment variable related proposals & ideas
Projects
None yet
Development

Successfully merging this pull request may close these issues.