feat: add public construction for `QueryId` #4508

MitchTurner · 2023-09-17T21:54:31Z

Description

Adding a way for QueryId to be constructed by the library consumer

Notes & open questions

The reason I need this is I am creating an abstraction of Libp2p in my code so that I can have a Swarm trait rather than depending on the concrete swarm directly. This allows me to fake swarm behavior in my tests. I need to be able to generate QueryIds in these fakes :)

I don't believe these changes require documentation or tests, but happy to add anything the maintainers would like :)

Change checklist

I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
A changelog entry has been made in the appropriate crates

thomaseizinger

Thanks for opening the PR!

I don't think we want to add such a constructor because it would potentially break invariants.

For testing Swarms, I'd recommend to take a look at: https://github.com/libp2p/rust-libp2p/blob/master/swarm-test/src/lib.rs

We use that across the codebase and it has been working really well for us.

MitchTurner · 2023-09-18T01:12:30Z

Oh. That looks promising. I'll take a look at that when I get a chance. I'll let you know if that seems like a good alternative.

thomaseizinger · 2023-09-18T02:16:23Z

Oh. That looks promising. I'll take a look at that when I get a chance. I'll let you know if that seems like a good alternative.

The advantage of using this is that your tests use the exact same interface as your prod code, just with a MemoryTransport underneath so you don't need to interact with the APIs of NetworkBehaviour directly :)

thomaseizinger · 2023-09-18T02:42:51Z

Will close this for now! :)

MitchTurner · 2023-09-18T04:47:17Z

Is there a way to push events manually using the test swarm?

thomaseizinger · 2023-09-18T05:28:12Z

Is there a way to push events manually using the test swarm?

Can you elaborate on what exactly you mean? libp2p-swarm-test is designed for black-box testing. If you want to trigger a certain behaviour, you should establish the necessary connections and have the individual protocols to their thing. For example: https://github.com/libp2p/rust-libp2p/blob/master/protocols/kad/tests/client_mode.rs

This kind of test design will result in much more stable tests because you are only using the public API of the module.

MitchTurner · 2023-09-18T06:40:10Z

Can you elaborate on what exactly you mean?

For sure! I'm writing code that handles incoming events. There is no concept of connections at this level, just that I'm getting events from somewhere and I want to make sure I handle each event correctly.

Since I don't need to worry about connecting with real peers, I've just abstracted away the swarm and created a trait that has the methods I need:

pub trait LibP2PSwarm: Send {
    fn next_event_fut(
        &mut self,
    ) -> BoxFuture<'_, SwarmEvent<MyBehaviourEvent, Either<Error, Void>>>;
}

And I have a fake impl for tests that makes dealing with events a breeze:

struct FakeSwarm {
    events: Vec<SwarmEvent<MyBehaviourEvent, Either<Error, Void>>>,
}

impl LibP2PSwarm for FakeSwarm {
    fn next_event_fut(
        &mut self,
    ) -> BoxFuture<'_, SwarmEvent<MyBehaviourEvent, Either<Error, Void>>> {
        Box::pin(async move {
            if let Some(event) = self.events.pop() {
                return event
            } else {
                std::future::pending().await
            }
        })
    }
}

So I can just say what events are coming in and write tests that check that my code is handling those events correctly.

libp2p-swarm-test is designed for black-box testing. If you want to trigger a certain behaviour, you should establish the necessary connections and have the individual protocols to their thing.

This seems similar to what I'm doing, in that you're specify the events that you want to trigger. The syntax is confusing me a bit, but I'll admit I'm still learning. But I don't want to worry about managing another swarm--that's out of scope.

I will definitely, definitely want to set up some integration tests using the MemoryTransport to actually test peering once I get to that, but it would be overkill for these unit tests. That can use my live implementation of LibP2PSwarm:

struct RealSwarm {
    swarm: Swarm<MyBehaviour>,
}

impl LibP2PSwarm for RealSwarm {
    fn next_event_fut(
        &mut self,
    ) -> BoxFuture<'_, SwarmEvent<MyBehaviourEvent, Either<Error, Void>>> {
        Box::pin(self.swarm.select_next_some())
    }
}

Do you see a flaw in my thinking? Or do you know a better way to do this?

thomaseizinger · 2023-09-18T07:02:42Z

I see what you are trying to get at but this is going to be quite hard to achieve I am afraid. Purposely, several of the events that we emit have crate-private constructors which allows us to attach invariants to them. We can then be sure that no-one apart from us can construct a particular ID etc. It also minimizes the public API we have to commit to and can treat more things as implementation details.

I'd recommend you do one or both of the following:

Pretty much all of the APIs on Swarm are also available within a NetworkBehaviour. Thus, depending on how complex your network protocol is, you should be able to create your own NetworkBehaviour and unit-test its logic using libp2p-swarm-test.
There are legit cases that need to orchestrate across multiple behaviours and/or the Swarm. I'd recommend that you pack that into an event-loop like construct like here: https://github.com/libp2p/rust-libp2p/blob/master/examples/file-sharing/src/network.rs. You should be able to test this layer again by instantiating it with a Swarm from libp2p-swarm-test. You can then be assured that your event loop correctly handles networking events and translates them into more high-level application events / commands. The events and commands API of your event-loop then becomes the contract that your remaining application can code against.

I'd advise against directly abstracting Swarm away and producing fake-events. A busy application will produce a lot of events and constructing them correctly (i.e. consistently with each other and in the correct order as a real swarm) will be very cumbersome. Plus, you are completely missing out on testing the code that issues commands on the Swarm as a result of these events. The event-loop I am suggesting above is a good fit because you will find yourself wanting to access Swarm a lot of times as a result of receiving an event. See for example:

rust-libp2p/examples/chat-example/src/main.rs

Lines 119 to 124 in f5e644d

    
           SwarmEvent::Behaviour(MyBehaviourEvent::Mdns(mdns::Event::Discovered(list))) => { 
        
               for (peer_id, _multiaddr) in list { 
        
                   println!("mDNS discovered a new peer: {peer_id}"); 
        
                   swarm.behaviour_mut().gossipsub.add_explicit_peer(&peer_id); 
        
               } 
        
           },

Not sure how new you are to Rust but in my experience, mocking data doesn't scale very well. Rust more or less gives you "integration tests" for free through its type-system and exhaustive pattern matching. If it compiles without warnings, you are probably fine on that front. Thus, I'd focus on tests of extremely small components that include critical decision-logic. You can probably extract those out into free functions. Lastly, I'd top it off with some larger e2e tests, possible still based on a memory-transport if you also can run the rest of your app completely within a process.

In general, Rust's type system results in really strong API contracts. But that also means that tiny changes can cause a ripple effect of compile errors through your codebase, even when all you wanted to do is make a small refactoring. My personal conclusion from this is that these kind of "mid-size" tests often hurt more than they help because they rely on APIs that change frequently which makes refactorings really painful because now you suddenly have all these tests that you need to fix too. Instead, you should figure out which APIs in your system are the least likely to change. Those you should write your tests against.

thomaseizinger · 2023-09-18T07:13:44Z

A piece of advice regarding traits: Don't treat them as interfaces. Forget everything you've learned about OOP. Traits should abstract over functionality. My rule of thumb is: If you don't have production code that is generic over your trait (i.e. fn foo<T: MyTrait>(t: T)), the trait probably shouldn't exist. If you are writing an application, you should rarely be in need of using traits. If you consider using a trait, you are probably thinking in terms of making a library, i.e. something abstract. But do you need a library? Do you have 2+ consumers of this code or are you creating premature abstractions?

Just my 2c :)

PS: I recently wrote a TURN server. Roughly 5000 LoC: https://github.com/firezone/firezone/tree/main/rust/relay

Guess how many traits it has? Just 7 but two are in the process of being upstream to a library and three are extension traits. The remaining two are legit traits that are being used as abstractions, see https://github.com/firezone/firezone/blob/10faffc4dbbc8fef7b2573f15a34357f1236569a/rust/relay/src/server.rs#L731-L735.

❯ rg trait
src/net_ext.rs
4:pub trait IpAddrExt {
17:pub trait SocketAddrExt {

src/stun_codec_ext.rs
9:pub trait MethodExt {
13:pub trait MessageClassExt {

src/server.rs
1050:/// Private helper trait to make [`error_response`] more ergonomic to use.
1051:trait StunRequest {
1075:/// Private helper trait to make [`Server::verify_auth`] more ergonomic to use.
1076:trait ProtectedRequest {

src/auth.rs
16:pub trait MessageIntegrityExt {

MitchTurner · 2023-09-18T08:11:27Z

I'm hitting the hay so I'm gonna have to digest everything you've written later, but thanks for all your input!

Just a little context, I'm not new to Rust; I've been using it for almost five years for work. The places I've worked have emphasized clean architecture and test-driven development with Rust. I like to approach designs with hexagonal architecture in mind (https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)?wprov=sfti1). All this to say, I know how I want to design my code but I'm not sure if libp2p was designed in a way that will make it easy. I'd prefer a library expose a trait of the interface I need to depend on for IO so that I can fake it as I please--I identified that interface as the API of the swarm since it is what is used in the event loops. I've only seen concrete swarms used in any of the example code though, and I think I can see why based on how the swarm is constructed.

That's why I went for my own trait.

This is for a personal project where I've abstracted the storage of the system away and this libp2p event handling is in an "adapter" for the storage. However a lot of my findings are gonna spill over into my work where we use libp2p. That code is overly coupled to libp2p and it's causing problems with maintainability. This personal project is kinda an exploration of solutions I hope to implement at work.

thomaseizinger · 2023-09-18T08:33:52Z

I like to approach designs with hexagonal architecture in mind (en.wikipedia.org/wiki/Hexagonal_architecture_(software)?wprov=sfti1).

I am a big fan of that. So much that I actually wrote a blog post on it several years ago: https://blog.eizinger.io/5835/rust-s-custom-derives-in-a-hexagonal-architecture-incompatible-ideas. Disclaimer: It is 4 years old and I don't hold the exact same believes any more.

There is definitely value in a ports-and-adapters design. If I were to implement one today, I think it'd go down the path of wrapping all of networking in a module (i.e. the eventloop described above) and treating that as the interface you are abstracting away.

This is similar to how you probably have an application-specific interface for your storage and don't directly code against SQL in your business logic, even though SQL is technically already an abstraction layer where you could swap a production Postgres for an in-memory SQlite for testing.

Really, it comes down to which components you want to be able to swap out and I think, directly swapping out Swarm will be very painful in how much code you have to write for your "fake" Swarm.

That code is overly coupled to libp2p and it's causing problems with maintainability.

Would love to learn more about the problems you are facing! Are you aware of our open-maintainers call, e.g. https://github.com/libp2p/rust-libp2p/discussions?discussions_q=open+maintainers+call+? The next one scheduled for September 26th. Also happy to hop on a call before that if you fancy.

work where we use libp2p

I am guessing you are talking about https://github.com/FuelLabs/fuel-core? It would be great if you could update to the latest version 😇 v0.50.0 is almost a year old already and 0.52.0 especially was a quite exciting release. Since then, we've for example stabilized QUIC which you might want to look into!

thomaseizinger · 2023-09-18T08:41:43Z

I'd prefer a library expose a trait of the interface I need to depend on for IO so that I can fake it as I please

This is rather difficult with libp2p because we manage connections under the hood and for example spread them across different tasks of your runtime to keep the latency of the behaviour state machine down. Additionally, given the encryption and multiplexing that we are doing, I don't see how you would actually be able to meaningfully fake that IO.

There is a universe in which libp2p is implemented in a SANS-IO way and you need to do all the IO yourself. That scales better than having interfaces, as interfaces are always a contract you are locked into compared to a state machine that you need to drive. For example, it is why I think it is worth exploring getting rid of RecordStore and instead externalizing that: #3035

I don't think using rust-libp2p would be very easy in that universe. SANS-IO libraries struggle to offer something that works out of the box because well, the whole point is to not do any IO so you need to do that yourself.

MitchTurner · 2023-09-19T01:07:00Z

Pretty much all of the APIs on Swarm are also available within a NetworkBehaviour. Thus, depending on how complex your network protocol is, you should be able to create your own NetworkBehaviour and unit-test its logic using libp2p-swarm-test.

Currently, I'm just copying the behavior from the distributed key share example, so I'm not as worried about the testing the behavior as I am testing my code that interacts with it--that's the interface I want to fake.

There are legit cases that need to orchestrate across multiple behaviours and/or the Swarm. I'd recommend that you pack that into an event-loop like construct like here: https://github.com/libp2p/rust-libp2p/blob/master/examples/file-sharing/src/network.rs. You should be able to test this layer again by instantiating it with a Swarm from libp2p-swarm-test. You can then be assured that your event loop correctly handles networking events and translates them into more high-level application events / commands. The events and commands API of your event-loop then becomes the contract that your remaining application can code against.

Yes. I do already have a event loop which takes input from the Storage interface as well as handling the events when the queries come back. I'll give it a try the way you're recommending. I looks like I don't really even need to connect to another swarm necessarily--but I'll have the option to. For example, if my code is doing a kademlia.get_record() I can do a kademlia.put_record() before handing the swarm to my code. Just about as good as what I've been doing :)

MitchTurner · 2023-09-19T01:14:16Z

Would love to learn more about the problems you are facing!

I am guessing you are talking about https://github.com/FuelLabs/fuel-core?

That's right! I've only been on the team for a couple months, so I'm still getting familiar with all the code. Refactoring isn't the priority right now, so I can't say that it will be super soon, but I'd love to jump on a call and discuss things once I done some serious spiking--it's possible it won't be major. In the mean time I'm settling with incremental changes to push things in the right direction and to see what is fragile 👍 and working with libp2p on personal projects to make sure I have a sound grasp.

thomaseizinger · 2023-09-19T02:01:13Z

Pretty much all of the APIs on Swarm are also available within a NetworkBehaviour. Thus, depending on how complex your network protocol is, you should be able to create your own NetworkBehaviour and unit-test its logic using libp2p-swarm-test.

Currently, I'm just copying the behavior from the distributed key share example, so I'm not as worried about the testing the behavior as I am testing my code that interacts with it--that's the interface I want to fake.

Good! I only mentioned it because it is a not-so-often-considered option of using rust-libp2p to write your own NetworkBehaviour. Really, they are just plugins and can encapsulate a lot of functionality. For example, as part of the 0.52 release, we enhanced the APIs to make it possible to implement connection management as a plugin. See https://github.com/libp2p/rust-libp2p/blob/master/misc/allow-block-list/src/lib.rs for an example.

What I am getting at is: If you can push certain functionality into a plugin, it is easier to (unit)-test that compared to adding more and more things on top. The only thing that plugins can't do is send arbitrary messages to each other. They likely will never be able to because it would cause circular dependencies between them.

There are legit cases that need to orchestrate across multiple behaviours and/or the Swarm. I'd recommend that you pack that into an event-loop like construct like here: master/examples/file-sharing/src/network.rs. You should be able to test this layer again by instantiating it with a Swarm from libp2p-swarm-test. You can then be assured that your event loop correctly handles networking events and translates them into more high-level application events / commands. The events and commands API of your event-loop then becomes the contract that your remaining application can code against.

Yes. I do already have a event loop which takes input from the Storage interface as well as handling the events when the queries come back. I'll give it a try the way you're recommending. I looks like I don't really even need to connect to another swarm necessarily--but I'll have the option to.

Yes. Do note that some operations might fail if we don't have any active connections.

Would love to learn more about the problems you are facing!

I am guessing you are talking about FuelLabs/fuel-core?

but I'd love to jump on a call and discuss things once I done some serious spiking--it's possible it won't be major. In the mean time I'm settling with incremental changes to push things in the right direction and to see what is fragile 👍 and working with libp2p on personal projects to make sure I have a sound grasp.

Sounds good! Would be great to have you on the call :)

MitchTurner · 2023-09-19T03:35:36Z

All sounds good! My tests are now at parity with where they were before and I'm feeling optimistic about moving forward. I'm going to have a lot more confidence that the code will do what I expect in the wild now that the tests are using more true Swarm code.

I'm sure I'll have more questions moving forward, but I'll open an issue or discussion rather than having another back-and-forth in some defunct PR :P

Thanks for your help and thanks for being a good maintainer ❤️

Add public construction for QueryId

d620801

MitchTurner changed the title ~~Add public construction for QueryId~~ feat: add public construction for QueryId Sep 17, 2023

Add CHANGELOG entry

646109c

MitchTurner marked this pull request as ready for review September 17, 2023 22:03

thomaseizinger requested changes Sep 18, 2023

View reviewed changes

thomaseizinger closed this Sep 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add public construction for `QueryId` #4508

feat: add public construction for `QueryId` #4508

MitchTurner commented Sep 17, 2023 •

edited

Loading

thomaseizinger left a comment

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023 •

edited

Loading

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023 •

edited

Loading

MitchTurner commented Sep 19, 2023 •

edited

Loading

MitchTurner commented Sep 19, 2023

thomaseizinger commented Sep 19, 2023

MitchTurner commented Sep 19, 2023

feat: add public construction for QueryId #4508

feat: add public construction for QueryId #4508

Conversation

MitchTurner commented Sep 17, 2023 • edited Loading

Description

Notes & open questions

Change checklist

thomaseizinger left a comment

Choose a reason for hiding this comment

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023 • edited Loading

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

MitchTurner commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023

thomaseizinger commented Sep 18, 2023 • edited Loading

MitchTurner commented Sep 19, 2023 • edited Loading

MitchTurner commented Sep 19, 2023

thomaseizinger commented Sep 19, 2023

MitchTurner commented Sep 19, 2023

feat: add public construction for `QueryId` #4508

feat: add public construction for `QueryId` #4508

MitchTurner commented Sep 17, 2023 •

edited

Loading

MitchTurner commented Sep 18, 2023 •

edited

Loading

thomaseizinger commented Sep 18, 2023 •

edited

Loading

MitchTurner commented Sep 19, 2023 •

edited

Loading