Feature-gate all protocols in the libp2p meta crate #1364

thomaseizinger · 2020-01-02T00:02:15Z

Depending on the meta-crate is nice because it is a one-stop shop for all imports.

Unfortunately, depending on the meta-crate libp2p brings in all possible protocols (kad, tcp, etc).

It would be nice to have a feature per protocol that one could activate from the meta crate, something like:

libp2p = { version = "0.13", features = ["tcp", "secio", "yamux", "swarm"] }

An orthogonal aspect to this is what the default features should be. The minimum you'd want to bring is probably libp2p-core (should that even be a feature?), so the default features might as-well be empty. That would mean even protocol-authors only have to depend on libp2p instead of pulling in libp2p_swarm etc to keep the dependency foot-print low.

The text was updated successfully, but these errors were encountered:

tomaka · 2020-01-02T10:45:51Z

Before doing that, I'd like to have at least some basic measurements that show that removing code does actually speeds up compilation substantially and/or decreases binary size, and that this is not just an urban myth.

thomaseizinger · 2020-01-03T01:55:43Z

That makes total sense :)

I've created a little "benchmark" where I feature-flagged the meta crate in a basic way.
To do that, I removed the CommonTransport stuff from lib.rs because I didn't want to spend too much time on it. If there are more feature-flags available, it might be easier to handle the definition of that in a different way.

I've also left core and core-derive as default dependencies. I don't think it makes sense to depend on libp2p without those two? core-derive might potentially be optional for crates that only want to define a new protocol.

Here is the benchmark repository: https://github.com/thomaseizinger/rust-libp2p-compile-benchmark.

current is a project that depends on libp2p as it is released on crates.io
feature-flagged is a project that depends on my fork with only the features enabled that we are using in our project at the moment.

On my machine the results are:

For the debug build, that is an improvement of ~20% and in the release build we have ~25%. (If I've got the math right 😁)

tomaka · 2020-01-17T17:52:25Z

For reference, it is possible to write this:

#[cfg_attr(docsrs, doc(cfg(feature = "foo")))]
use ...;

Tokio does that, and here's how it renders: https://docs.rs/tokio/0.2.9/tokio/#modules

cc rust-lang/rust#43781

thomaseizinger · 2020-01-27T23:52:42Z

I've played with an alternative to this approach, which is to depend on the individual crates and compose everything together manually.

Unfortunately, I've hit a wall with the derive(NetworkBehaviour) macro as this one seems to assume that the libp2p crate is in the dependency tree.

tomaka · 2020-01-28T08:09:43Z

seems to assume that the libp2p crate is in the dependency tree.

I'm aware of this, but I don't think it's fixable.

thomaseizinger · 2020-01-29T05:20:52Z

seems to assume that the libp2p crate is in the dependency tree.

I'm aware of this, but I don't think it's fixable.

I don't think it needs fixing, it is a fine assumption that people are expected to only depend on libp2p.
To me, it is another motivation for providing said feature-flags because there are basically no alternatives (not using the custom derive left out).

What is the long-term plan in regards to the crate structure of libp2p?
Have you ever considered merging everything into one crate (plus providing feature-flags)?

It would also help with issues like #1365 :)

tomaka · 2020-01-29T09:29:17Z

What is the long-term plan in regards to the crate structure of libp2p?
Have you ever considered merging everything into one crate (plus providing feature-flags)?

That's what I would do if it was a personal project. In terms of library user experience, having one single crate beats everything else.

But in the case of libp2p, it's always been a major goal to be extensible and that everybody should be able to add their own, and merging everything into one library kind of sends the wrong message.

thomaseizinger · 2020-01-29T23:33:54Z

But in the case of libp2p, it's always been a major goal to be extensible and that everybody should be able to add their own, and merging everything into one library kind of sends the wrong message.

Interesting, can you elaborate, what would be so wrong about that?

Which extensions to you see possible at the moment that would they go away if libp2p were a single, heavily feature-flagged crate?

tomaka · 2020-02-12T11:14:43Z

Interesting, can you elaborate, what would be so wrong about that?

I guess I could ping @mgoelzer or @raulk if they have an opinion about this?

ghost · 2020-02-23T01:48:30Z

@tomaka No strong feelings here. Doing this does make the cargo.toml a bit more complicated for the beginner (it's a bit reminiscent of the syn crate's features split), but since one of the goals of the libp2p project is modularity, making this change would certainly not be inconsistent with the philosophy of the project.

raulk · 2020-02-23T02:06:08Z

As long as the library is modular in terms of API (e.g. it’s possible to mount custom protocols, secure channels, multiplexers, discoverers, etc. easily), and users can depend only on the modules they actually require, I could be ok taking the path of least resistance that also simplifies life for developers, and feels natural in the target language.

Questions/comments:

During a build, would cargo resolve all dependencies, or only the ones relevant to the feature?
One concern that’s irrelevant in rust-libp2p because it is a monorepo anyway is how module ownership/custodianship is carried out. This keeps coming up every time we evaluate a monorepo setup in go or js, but in this case, it would make no difference because all code is physically hosted on the same repo.
It would, however, make a difference in terms of releasing, no?

ghost · 2020-02-23T05:51:18Z

During a build, would cargo resolve all dependencies, or only the ones relevant to the feature?

If I'm understanding the question correctly, then no. The features = [...] is a conditional compilation thing, so in the example @thomaseizinger gave

libp2p = { version = "0.13", features = ["tcp", "secio", "yamux", "swarm"] }

the compiler would not pull in the noise crate because noise is not in that list, even though it is present in the source tree. If the programmer tried to use noise anyway, it would be a compile-time error, undefined symbol or some such.

Right now the meta crate pulls in a bunch of sub crates:

[dependencies]
multiaddr = { package = "parity-multiaddr", version = "0.7.0", path = "misc/multiaddr" }
multihash = { package = "parity-multihash", version = "0.2.1", path = "misc/multihash" }
libp2p-mplex = { version = "0.15.0", path = "muxers/mplex" }
libp2p-identify = { version = "0.15.0", path = "protocols/identify" }
libp2p-kad = { version = "0.15.0", path = "protocols/kad" }
libp2p-floodsub = { version = "0.15.0", path = "protocols/floodsub" }
libp2p-gossipsub = { version = "0.15.0", path = "./protocols/gossipsub" }
libp2p-ping = { version = "0.15.0", path = "protocols/ping" }
libp2p-plaintext = { version = "0.15.0", path = "protocols/plaintext" }
libp2p-pnet = { version = "0.15.0", path = "protocols/pnet" }
libp2p-core = { version = "0.15.0", path = "core" }
libp2p-core-derive = { version = "0.15.0", path = "misc/core-derive" }
libp2p-secio = { version = "0.15.0", path = "protocols/secio", default-features = false }
libp2p-swarm = { version = "0.5.0", path = "swarm" }
libp2p-uds = { version = "0.15.0", path = "transports/uds" }
libp2p-wasm-ext = { version = "0.8.0", path = "transports/wasm-ext" }
libp2p-yamux = { version = "0.15.0", path = "muxers/yamux" }
[...]

If all you wanted to do is write a new protocol, your project would start by including the meta crate like this:

[dependencies]
libp2p = "0.15.0"

which will then pull into your build that long list of crates above.

llvm doesn't do dead code elimination in debug builds, only release builds (I think - @tomaka would know better), so it's a compile time[*] and binary size tradeoff versus the simplicity of just being able to write libp2p = "0.15.0".

feels natural in the target language

The current way (libp2p = "0.15.0") is probably more common, but that's because most crates on crates.io are less complex and much more monolithic than libp2p. Either would feel natural, but the features=[...] version raises the barrier to getting started because the new developer has to somehow already know that there exists both secio and noise (for example).

[*] Compile time should only be a factor if you're starting from a cargo clean build, because the toolchain caches stuff that was previously compiled. In the normal write->compile->run->repeat workflow, you're not doing a cargo clean. Actually, I was a little confused by @thomaseizinger benchmarking of the compile times because of this, so there may be some element of this that I'm misunderstanding.

thomaseizinger · 2020-02-23T22:59:01Z

Actually, I was a little confused by @thomaseizinger benchmarking of the compile times because of this, so there may be some element of this that I'm misunderstanding.

The reason I did the benchmark was because @tomaka was asking for data :)
You are totally right, during normal dev iteration, the increase in compile times is not noticable because the results are cached.

It still quite a tax upon complete rebuilds though which also happen from time to time. However, it is not just compile-times.
Having unused dependencies in our dependency tree makes things like auditing our own project harder. There are quite a few outdated versions of crates in our tree that are only brought in by libp2p through a protocol that we don't use.

It is not super harmful but it would be nice to be able to get rid of them :)

It would, however, make a difference in terms of releasing, no?

Depends. If we keep the meta-crate approach, the release workflow should remain unchanged.
There is an orthogonal thing that could simplify releases, which is to merge all sub-crates into the current meta crate. That makes it impossible for other protocol-crates to only depend on what they need (say, just libp2p-swarm for example). But that could be solved again by these feature-flags.

tomaka · 2020-02-24T10:56:07Z

There is an orthogonal thing that could simplify releases, which is to merge all sub-crates into the current meta crate.

I'd be in favour of that, as it makes the general user experience of the library way better.

That makes it impossible for other protocol-crates to only depend on what they need (say, just libp2p-swarm for example).

You could still depend on libp2p with just the swarm feature enabled for example.

tomaka added the difficulty:easy label Jan 2, 2020

tomaka mentioned this issue Feb 5, 2020

Remove support for secio paritytech/substrate#4831

Merged

tomaka mentioned this issue Feb 24, 2020

Feature-gate all dependencies #1467

Merged

tomaka closed this as completed in #1467 Mar 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature-gate all protocols in the libp2p meta crate #1364

Feature-gate all protocols in the libp2p meta crate #1364

thomaseizinger commented Jan 2, 2020

tomaka commented Jan 2, 2020

thomaseizinger commented Jan 3, 2020

tomaka commented Jan 17, 2020 •

edited

Loading

thomaseizinger commented Jan 27, 2020 •

edited

Loading

tomaka commented Jan 28, 2020

thomaseizinger commented Jan 29, 2020

tomaka commented Jan 29, 2020 •

edited

Loading

thomaseizinger commented Jan 29, 2020

tomaka commented Feb 12, 2020 •

edited

Loading

ghost commented Feb 23, 2020

raulk commented Feb 23, 2020 •

edited

Loading

ghost commented Feb 23, 2020 •

edited by ghost

Loading

thomaseizinger commented Feb 23, 2020 •

edited

Loading

tomaka commented Feb 24, 2020

Feature-gate all protocols in the libp2p meta crate #1364

Feature-gate all protocols in the libp2p meta crate #1364

Comments

thomaseizinger commented Jan 2, 2020

tomaka commented Jan 2, 2020

thomaseizinger commented Jan 3, 2020

tomaka commented Jan 17, 2020 • edited Loading

thomaseizinger commented Jan 27, 2020 • edited Loading

tomaka commented Jan 28, 2020

thomaseizinger commented Jan 29, 2020

tomaka commented Jan 29, 2020 • edited Loading

thomaseizinger commented Jan 29, 2020

tomaka commented Feb 12, 2020 • edited Loading

ghost commented Feb 23, 2020

raulk commented Feb 23, 2020 • edited Loading

ghost commented Feb 23, 2020 • edited by ghost Loading

thomaseizinger commented Feb 23, 2020 • edited Loading

tomaka commented Feb 24, 2020

tomaka commented Jan 17, 2020 •

edited

Loading

thomaseizinger commented Jan 27, 2020 •

edited

Loading

tomaka commented Jan 29, 2020 •

edited

Loading

tomaka commented Feb 12, 2020 •

edited

Loading

raulk commented Feb 23, 2020 •

edited

Loading

ghost commented Feb 23, 2020 •

edited by ghost

Loading

thomaseizinger commented Feb 23, 2020 •

edited

Loading