rfc: collapse Tokio sub crates into single tokio crate #1318
Comments
In the Reddit thread on actix-web that probably prompted this, you said:
Now this sure sounds like a good thing to me, but is it known if any projects only pull in a small fraction of the tokio-* crates? It's always been my impression from looking at dependency graphs that they don't. I'm hoping someone has a good idea on how to collect data for this apart from manually auditing dependency graphs, which is the best I know how to do. |
This seems generally reasonable, but I think we do need to figure out what to do with In the new world, I think we will want some independent, small crate that defines AsyncRead/AsyncWrite/etc. Maybe that's just futures-io (there is another issue open for this)? |
Overall I think merging the crates will lead to a better experience for consuming crates. Breaking changes will require a lot more effort to re-export things than bumping individual crates, but in my experience, there are very few "leaf" tokio crates that can be easily bumped without having to bump other crates as well. So, net-net, I don't think we'll lose much flexibility in practice. Also feature gating everything could make it easy to accidentally include too much (à la default features), but the remedy is a quick Cargo.toml edit rather than a code refactor, which is a better experience. |
A single tokio crate would make it a lot easier to maintain tokio in distros that package individual crates, like debian and fedora. I think using a single crate is also a good idea if rust is ever going to do dynamic linking per crate. |
I think that merging all of the crates into a single crate may be mostly just sweeping the perceived bloat under the rug. A quick perusal of I think that Another source of extra crates seems to be
There are a number of the dependencies in I think that there are also some real issues of There are some places where merging might make sense, such as the crates in the Also, while the original comment in this thread indicates that the Finally, I think one of the bigger wins may be some better way of counting or visualizing dependencies, which takes into account sets of dependencies that all come from the same source. If some piece of code is coming from a feature or a sub-crate developed in the same repository, the only main difference is that it can be more easily used and semantically versioned independently if it's a separate crate; but it will show up quite differently in a dependency graph, leading to the feeling of bloat; and some of what I think feels like the perceived bloat is also the perceived number of sources that need to be audited or people and organizations that need to be trusted. If the notion of multiple sub-crates actually all coming from the same parent project/workspace/repository were more prominent in some of these tools like Since this was long, in summary:
|
With feature flags there might be a risk that you don't include the ones you need, because other dependencies of yours happen to include them, leading to possible breakage down the road when your dependencies change the flags they require. Maybe that's not a big problem since it's easy enough to fix after it happens? But it would be more of a problem for newer coders who might not understand why they're getting errors. |
Another concern is forking and With single crate, entire Tokio needs to be replaced instead of e.g. just |
I understand the maintainability concerns but I don't get the issue with the number of dependencies. Currently someone can depend on tokio, which re-exports common crates, so one does not have to manually deal with many dependencies. In general I appreciate the modularity of the ecosystem and I feel using features to achieve the same end is less elegant and practical/discoverable for users. Now, the development burden is a good reason to go for a single crate, but as a Tokio user I feel the current solution is practical. It lets libraries depend only on relevant crates and binaries can easily pull tokio. Regarding @kpcyrd's comment on packaging tokio in Linux distros, I don't really see how this helps until Rust has a fixed ABI, and it seems current decisions shouldn't be based on such long term prospects. I also don't see Linux distros packaging Rust libraries like they package C/C++ lib headers or Python modules, because Cargo handles that much better. |
I'm not sure if Cargo features are robust enough for this. For example, in practice Rust gives poor error messages when user forgets to add a feature flag. It just prints that the thing doesn't exist, but the docs say it exists! Super confusing. |
In what way? I can see two sides:
The first case could be fixed by still having separate crates, but also offering a top-level crate that groups and re-exports all of them. Users would add The second is not a real problem IMHO, but merely a perception of a problem. The amount of code compiled will be similar either way (or even worse, given I've got a feeling that there's a group of new Rust users who come from languages with either huge stdlib (so nobody needs to use dependencies), or languages where dependencies are a pain (so everybody avoids using dependencies), so they're shocked how nonchalantly Rust/Cargo uses deps. But for Rust that's fine, so the real problem is communicating to users that they shouldn't be worried when the compilation step prints many lines of "Compiling X". |
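The "separate crates plus a top-level crate" arrangement mentioned above could look roughly like this in the top-level crate's manifest. The sub crate names `tokio-timer` and `tokio-tcp`, the version numbers, and the feature names are illustrative assumptions, not taken from the thread:

```toml
[package]
name = "tokio"
version = "0.1.0"   # placeholder version for this sketch
edition = "2018"

[dependencies]
# Each component stays in its own crate and is pulled in optionally.
tokio-timer = { version = "0.2", optional = true }
tokio-tcp = { version = "0.1", optional = true }

[features]
# Everything is on by default; users can trim the set with
# `default-features = false` plus an explicit feature list.
default = ["timer", "tcp"]
timer = ["tokio-timer"]
tcp = ["tokio-tcp"]
```

The top-level crate's `lib.rs` would then re-export each enabled component, so users add a single dependency while the sub crates remain separately versioned.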
I think users (including me) are frustrated by how fast compile time grows as we add dependencies, and the size of the dependency graph is an easy target for complaint. The presence of all the tokio crates and multiple versions of all the rand crates in an ostensibly single-threaded program quickly adds credibility to blaming the compilation time on the size of the dependency tree. It would be interesting to assess how end-user compilation time is altered by the suggested changes. |
I think this is an XY problem. The perceived bloat is solved by binary packages in cargo/crates.io and caching it close to the CI instance. |
I'm not talking about CI. |
I am personally in the "happy with the current situation" boat. The arguments against many dependencies usually boil down to three metrics:
If maintaining all the smaller Note: if managing separate versions of the crates and which depends on which is indeed the main motive for this change, there are automation tools that alleviate or minimize that burden while keeping all the benefits of the current approach |
@saethlin, my point was more general to the issue being addressed by the RFC: users perceive bloat in tokio and the RFC here is attempting to mitigate it by making an uber crate that wraps everything up. Aside from creating a new project, CI for a project depending on tokio, and actually working on tokio itself, when do you need to build tokio? |
Those are two separate issues. If you use micro libraries dynamic linking would imply loading >100 .so's into the process which is a non-zero-cost abstraction. With "C sized" crates the unused code would be LTO'd anyway. This is unrelated to distros. The very real problem with distros is the review process for new packages. If rand decides it's going to need 5 more rand-* crates we need to get them all reviewed and approved. To upload the new crates we need to update rand-core first which breaks the existing dependency tree. Updating rand is generally a non-trivial effort that takes multiple weeks (up to months). |
Starting at https://crates.io/crates/tokio/reverse_dependencies, I've listed the numbers of reverse dependencies on crates.io
Some additional random thoughts.
|
One thing I forgot to consider earlier: can cargo handle different feature flags across dependencies and dev-dependencies? (at least it wasn't able to in the past...) A common pattern I've seen is libraries depending on the minimal tokio functionality they need, while pulling in all bells and whistles during testing. If cargo cannot support enabling extra dependency features during testing, crates may end up depending on the entirety of tokio |
@Mathspy This is not necessarily true- rustc itself does parallelize compilation within a single crate, while cargo doesn't (yet) compile dependency chains in parallel. So depending on the crate graph things could go either way- it would have to be benchmarked, and it will change over time as the tools improve. But this is still a relevant point- one of the reasons people complain about the number of dependencies is that it's a proxy for "compilation is slow," and simply merging them will not really change things there. The only way to fix that is to compile less, and simpler, code. |
And we will get a bloat of dependencies if you use tokio for codecs only in the library and tcp/udp in tests, because cargo will combine two features together:
It will compile tcp and runtime even for non-test builds for the end users of a library. |
Not if they are pulling this library from crates.io, dev-dependencies features are only merged in when a crate is inside the current workspace or a path dependency. |
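The situation discussed in the last few comments can be sketched as a library manifest. The feature names `codec`, `tcp`, and `rt-full` and the version number are hypothetical placeholders, not confirmed Tokio feature names:

```toml
[dependencies]
# What the library itself needs at build time.
tokio = { version = "0.2", default-features = false, features = ["codec"] }

[dev-dependencies]
# Extra components used only by the library's own tests and examples.
tokio = { version = "0.2", default-features = false, features = ["codec", "tcp", "rt-full"] }
```

Per the reply above, a user who pulls this library from crates.io would only get the `[dependencies]` feature set; the extra features are merged in when the crate is part of the current workspace or a path dependency.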
Good to know. Thanks |
cargo-crev reviews and cargo-audit security advisories are per crate, and don't take features into account. If tokio keeps crates separate, and there's an issue with one of the less often used components, these will affect fewer users. |
@rpjohnst Oh! I see, thank you for the clarification! |
That's a tradeoff with runtime overhead for a tooling problem though. It doesn't improve security, only binaries that actively run the vulnerable code are affected in both cases. |
To me, the most important factors are:
It seems that this change makes the first one harder, and allows the latter to grow with lesser negligence than the current arrangement. Why should |
There are valid arguments for and downsides to both approaches. Since many of the comments have been pro multi-crate, I'll add some for the single crate approach (which is my preferred solution).
Reviews
I work with multiple companies that require each dependency to be reviewed. Each version must be signed off by an employee for production use. Splitting libraries like tokio into multiple crates makes this a lot more work. The impact is smaller on the initial review because you need to look at all code anyway, but jumping around between different crates still makes this harder than with a single crate. You need more understanding of the architecture and boundaries between the crates, and just need to keep more things in your head. I also know companies that have policies like
Maintenance burden
Even without mandatory review requirements as above, more crates invariably lead to higher velocity of change. More releases, more CHANGELOGs to read, more version bumps, more chances for the inevitable bugs to cause a problem and more friction in the entire ecosystem. Breaking changes are especially bad in this context. While this also makes it easier to get changes out instead of consolidating to a single-crate release, I'm wondering if this kind of velocity is actually desirable for such an important low-level building block - assuming a certain stability and maturity of the codebase. It also increases the chance of multiple versions of a crate sneaking into your build, which is always suboptimal (build time, inconsistencies, ...) and leads to the often annoying process of finding out why and fixing it - usually with a PR for another dependency.
Contributing
@qm3ster mentioned that a single crate would make contributing harder. I'm curious why that is. A single crate is much better for understanding the code and first time contributing IMO.
Build Performance
There have been multiple claims for better build performance with multiple crates, due to parallel compilation.
This claim really needs some substantiation with measurements. Subjectively, I remember build times being better before the split up in tokio and futures. This of course might just be because the stack has grown in complexity and gained more features. But the point is: including performance in a decision would need some validating benchmarks.
Most Common Use Case
@AZon8 posted some numbers for how tokio is used on crates.io. Most applications using tokio are private and not on crates.io, so crates.io data make dependencies on sub-crates look much more common than they are in total real world use, due to the emphasis on libraries and building blocks vs full applications. I think it makes sense to optimize for the most common use case, assuming that more selective usage is still possible. |
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The `fs` implementation is now provided by the main `tokio` crate. The `fs` functionality may still be excluded from the build by skipping the `fs` feature flag.
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The `timer` implementation is now provided by the main `tokio` crate. The `timer` functionality may still be excluded from the build by skipping the `timer` feature flag.
Related to #1318, Tokio APIs that are "less stable" are moved into a new `tokio-util` crate. This crate will mirror `tokio` and provide additional APIs that may require a greater rate of breaking changes.
Related to #1318, Tokio APIs that are "less stable" are moved into a new `tokio-util` crate. This crate will mirror `tokio` and provide additional APIs that may require a greater rate of breaking changes. As examples require `tokio-util`, they are moved into a separate crate (`examples`). This has the added advantage of being able to avoid example only dependencies in the `tokio` crate.
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The `net` implementation is now provided by the main `tokio` crate. Functionality can be opted out of by using the various net related feature flags.
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The `io` implementation is now provided by the main `tokio` crate. Functionality can be opted out of by using the various net related feature flags.
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The executor implementation is now provided by the main `tokio` crate. Functionality can be opted out of by using the various net related feature flags.
A step towards collapsing Tokio sub crates into a single `tokio` crate (#1318). The sync implementation is now provided by the main `tokio` crate. Functionality can be opted out of by using the various net related feature flags.
This has been implemented. |
There has been frustration among Tokio users regarding the number of crates pulled in when depending on Tokio. Here is an opportunity to discuss an alternative strategy. By doing this RFC, users who are happy with the current situation may express this.
Summary
Do not maintain `tokio-*` sub crates; instead, all Tokio code will exist in a single `tokio` crate, and components are enabled or disabled using feature flags.
For example, depending on only the timer functionality could be done with:
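A minimal sketch of such a declaration in a downstream `Cargo.toml`, assuming the feature is named `timer` and using the 0.2 version number purely for illustration:

```toml
[dependencies]
# Turn off the default component set, then opt back in to the timer only.
# The version and the feature name `timer` are assumptions for this sketch.
tokio = { version = "0.2", default-features = false, features = ["timer"] }
```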
By default, `tokio` would have the same components enabled as it does today.
Motivation
Maintaining a large number of crates comes with an increased maintainership burden. Maintaining correct dependencies between crates is complex. Users feel that a large number of dependencies == bloat. Additional rationale can be found here.
Details
Tokio must maintain semver stability of its core APIs. This includes traits as well as some types, such as `TcpStream`. Tokio would like to be able to release breaking changes to less fundamental APIs without having to break the entire Tokio ecosystem.
Currently, Tokio achieves this goal by breaking up all the various components into individual crates. Doing this allows less stable components to release breaking changes without touching stable components. However, this strategy has drawbacks (see Motivation section).
In this proposal, all Tokio components would be moved into a single crate. Each component would have an associated feature flag, similar to how Tokio does it today.
Not much would change for application developers: they would still just depend on `tokio` and enable / disable feature flags as needed. Library developers would no longer depend on sub crates. Instead, they would depend on `tokio` and only pull in the features that they need.
Type stability
Core types can maintain stability between breaking semver releases. For example, if the `TcpStream` type does not change between Tokio version 0.2 and Tokio version 0.3, then the following steps would be taken to release 0.3:
- `tokio` 0.3
- `tokio` 0.2 to depend on `tokio` 0.3.
- `TcpStream` in 0.3 by re-exporting the implementation from 0.3.
- `TcpStream` type from 0.3.
By doing this, `TcpStream` from 0.2 and 0.3 are the same type.
Drawbacks
Alternatives
Continue to release new crates for each component.
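The type-stability steps in the Details section resemble the re-export approach often called the "semver trick". A rough sketch of the manifest for a follow-up 0.2.x release, where the version number `0.2.99` and the dependency alias `tokio_03` are hypothetical:

```toml
[package]
name = "tokio"
version = "0.2.99"   # hypothetical follow-up release in the 0.2 series
edition = "2018"

[dependencies]
# Depend on the newer release under a different name so that both
# versions can be referred to from the 0.2.x source code.
tokio_03 = { package = "tokio", version = "0.3" }

# In the 0.2.x sources, the type would then be re-exported rather than
# defined, for example:
#     pub use tokio_03::net::TcpStream;
```

Because the 0.2.x release only re-exports the 0.3 definition, code built against either version would name the same `TcpStream` type.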