Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature-gate all protocols in the libp2p meta crate #1364

Closed
thomaseizinger opened this issue Jan 2, 2020 · 14 comments · Fixed by #1467
Closed

Feature-gate all protocols in the libp2p meta crate #1364

thomaseizinger opened this issue Jan 2, 2020 · 14 comments · Fixed by #1467

Comments

@thomaseizinger
Copy link
Contributor

Depending on the meta-crate is nice because it is a one-stop shop for all imports.

Unfortunately, depending on the meta-crate libp2p brings in all possible protocols (kad, tcp, etc).

It would be nice to have a feature per protocol that one could activate from the meta crate, something like:

libp2p = { version = "0.13", features = ["tcp", "secio", "yamux", "swarm"] }

An orthogonal aspect to this is what the default features should be. The minimum you'd want to bring is probably libp2p-core (should that even be a feature?), so the default features might as-well be empty. That would mean even protocol-authors only have to depend on libp2p instead of pulling in libp2p_swarm etc to keep the dependency foot-print low.

@tomaka
Copy link
Member

tomaka commented Jan 2, 2020

Before doing that, I'd like to have at least some basic measurements that show that removing code does actually speeds up compilation substantially and/or decreases binary size, and that this is not just an urban myth.

@thomaseizinger
Copy link
Contributor Author

That makes total sense :)

I've created a little "benchmark" where I feature-flagged the meta crate in a basic way.
To do that, I removed the CommonTransport stuff from lib.rs because I didn't want to spend too much time on it. If there are more feature-flags available, it might be easier to handle the definition of that in a different way.

I've also left core and core-derive as default dependencies. I don't think it makes sense to depend on libp2p without those two? core-derive might potentially be optional for crates that only want to define a new protocol.

Here is the benchmark repository: https://github.com/thomaseizinger/rust-libp2p-compile-benchmark.

  • current is a project that depends on libp2p as it is released on crates.io
  • feature-flagged is a project that depends on my fork with only the features enabled that we are using in our project at the moment.

On my machine the results are:

image

For the debug build, that is an improvement of ~20% and in the release build we have ~25%. (If I've got the math right 😁)

@tomaka
Copy link
Member

tomaka commented Jan 17, 2020

For reference, it is possible to write this:

#[cfg_attr(docsrs, doc(cfg(feature = "foo")))]
use ...;

Tokio does that, and here's how it renders: https://docs.rs/tokio/0.2.9/tokio/#modules

cc rust-lang/rust#43781

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Jan 27, 2020

I've played with an alternative to this approach, which is to depend on the individual crates and compose everything together manually.

Unfortunately, I've hit a wall with the derive(NetworkBehaviour) macro as this one seems to assume that the libp2p crate is in the dependency tree.

@tomaka
Copy link
Member

tomaka commented Jan 28, 2020

seems to assume that the libp2p crate is in the dependency tree.

I'm aware of this, but I don't think it's fixable.

@thomaseizinger
Copy link
Contributor Author

seems to assume that the libp2p crate is in the dependency tree.

I'm aware of this, but I don't think it's fixable.

I don't think it needs fixing, it is a fine assumption that people are expected to only depend on libp2p.
To me, it is another motivation for providing said feature-flags because there are basically no alternatives (not using the custom derive left out).

What is the long-term plan in regards to the crate structure of libp2p?
Have you ever considered merging everything into one crate (plus providing feature-flags)?

It would also help with issues like #1365 :)

@tomaka
Copy link
Member

tomaka commented Jan 29, 2020

What is the long-term plan in regards to the crate structure of libp2p?
Have you ever considered merging everything into one crate (plus providing feature-flags)?

That's what I would do if it was a personal project. In terms of library user experience, having one single crate beats everything else.

But in the case of libp2p, it's always been a major goal to be extensible and that everybody should be able to add their own, and merging everything into one library kind of sends the wrong message.

@thomaseizinger
Copy link
Contributor Author

But in the case of libp2p, it's always been a major goal to be extensible and that everybody should be able to add their own, and merging everything into one library kind of sends the wrong message.

Interesting, can you elaborate, what would be so wrong about that?

Which extensions to you see possible at the moment that would they go away if libp2p were a single, heavily feature-flagged crate?

@tomaka
Copy link
Member

tomaka commented Feb 12, 2020

Interesting, can you elaborate, what would be so wrong about that?

I guess I could ping @mgoelzer or @raulk if they have an opinion about this?

@ghost
Copy link

ghost commented Feb 23, 2020

@tomaka No strong feelings here. Doing this does make the cargo.toml a bit more complicated for the beginner (it's a bit reminiscent of the syn crate's features split), but since one of the goals of the libp2p project is modularity, making this change would certainly not be inconsistent with the philosophy of the project.

@raulk
Copy link
Member

raulk commented Feb 23, 2020

As long as the library is modular in terms of API (e.g. it’s possible to mount custom protocols, secure channels, multiplexers, discoverers, etc. easily), and users can depend only on the modules they actually require, I could be ok taking the path of least resistance that also simplifies life for developers, and feels natural in the target language.

Questions/comments:

  • During a build, would cargo resolve all dependencies, or only the ones relevant to the feature?
  • One concern that’s irrelevant in rust-libp2p because it is a monorepo anyway is how module ownership/custodianship is carried out. This keeps coming up every time we evaluate a monorepo setup in go or js, but in this case, it would make no difference because all code is physically hosted on the same repo.
  • It would, however, make a difference in terms of releasing, no?

@ghost
Copy link

ghost commented Feb 23, 2020

During a build, would cargo resolve all dependencies, or only the ones relevant to the feature?

If I'm understanding the question correctly, then no. The features = [...] is a conditional compilation thing, so in the example @thomaseizinger gave

libp2p = { version = "0.13", features = ["tcp", "secio", "yamux", "swarm"] }

the compiler would not pull in the noise crate because noise is not in that list, even though it is present in the source tree. If the programmer tried to use noise anyway, it would be a compile-time error, undefined symbol or some such.

Right now the meta crate pulls in a bunch of sub crates:

[dependencies]
multiaddr = { package = "parity-multiaddr", version = "0.7.0", path = "misc/multiaddr" }
multihash = { package = "parity-multihash", version = "0.2.1", path = "misc/multihash" }
libp2p-mplex = { version = "0.15.0", path = "muxers/mplex" }
libp2p-identify = { version = "0.15.0", path = "protocols/identify" }
libp2p-kad = { version = "0.15.0", path = "protocols/kad" }
libp2p-floodsub = { version = "0.15.0", path = "protocols/floodsub" }
libp2p-gossipsub = { version = "0.15.0", path = "./protocols/gossipsub" }
libp2p-ping = { version = "0.15.0", path = "protocols/ping" }
libp2p-plaintext = { version = "0.15.0", path = "protocols/plaintext" }
libp2p-pnet = { version = "0.15.0", path = "protocols/pnet" }
libp2p-core = { version = "0.15.0", path = "core" }
libp2p-core-derive = { version = "0.15.0", path = "misc/core-derive" }
libp2p-secio = { version = "0.15.0", path = "protocols/secio", default-features = false }
libp2p-swarm = { version = "0.5.0", path = "swarm" }
libp2p-uds = { version = "0.15.0", path = "transports/uds" }
libp2p-wasm-ext = { version = "0.8.0", path = "transports/wasm-ext" }
libp2p-yamux = { version = "0.15.0", path = "muxers/yamux" }
[...]

If all you wanted to do is write a new protocol, your project would start by including the meta crate like this:

[dependencies]
libp2p = "0.15.0"

which will then pull into your build that long list of crates above.

llvm doesn't do dead code elimination in debug builds, only release builds (I think - @tomaka would know better), so it's a compile time[*] and binary size tradeoff versus the simplicity of just being able to write libp2p = "0.15.0".

feels natural in the target language

The current way (libp2p = "0.15.0") is probably more common, but that's because most crates on crates.io are less complex and much more monolithic than libp2p. Either would feel natural, but the features=[...] version raises the barrier to getting started because the new developer has to somehow already know that there exists both secio and noise (for example).

[*] Compile time should only be a factor if you're starting from a cargo clean build, because the toolchain caches stuff that was previously compiled. In the normal write->compile->run->repeat workflow, you're not doing a cargo clean. Actually, I was a little confused by @thomaseizinger benchmarking of the compile times because of this, so there may be some element of this that I'm misunderstanding.

@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Feb 23, 2020

Actually, I was a little confused by @thomaseizinger benchmarking of the compile times because of this, so there may be some element of this that I'm misunderstanding.

The reason I did the benchmark was because @tomaka was asking for data :)
You are totally right, during normal dev iteration, the increase in compile times is not noticable because the results are cached.

It still quite a tax upon complete rebuilds though which also happen from time to time. However, it is not just compile-times.
Having unused dependencies in our dependency tree makes things like auditing our own project harder. There are quite a few outdated versions of crates in our tree that are only brought in by libp2p through a protocol that we don't use.

It is not super harmful but it would be nice to be able to get rid of them :)

It would, however, make a difference in terms of releasing, no?

Depends. If we keep the meta-crate approach, the release workflow should remain unchanged.
There is an orthogonal thing that could simplify releases, which is to merge all sub-crates into the current meta crate. That makes it impossible for other protocol-crates to only depend on what they need (say, just libp2p-swarm for example). But that could be solved again by these feature-flags.

@tomaka
Copy link
Member

tomaka commented Feb 24, 2020

There is an orthogonal thing that could simplify releases, which is to merge all sub-crates into the current meta crate.

I'd be in favour of that, as it makes the general user experience of the library way better.

That makes it impossible for other protocol-crates to only depend on what they need (say, just libp2p-swarm for example).

You could still depend on libp2p with just the swarm feature enabled for example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants