Rearchitecting Rust ABCI #61
@tomtau
I'd imagine the simplest approach would be to separate functionality via traits. The challenge is identifying the right connection for the right trait.
@davebryson: request reading/parsing should now be asynchronous and separated from response writing -- so one way to go about it could be to have 3 concurrent channels/queues (one for each connection type), and to place/take parsed requests plus their origin (some way to use / refer to the corresponding writer) there.
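A minimal sketch of that queue-per-connection idea, using `std::sync::mpsc` with one long-lived worker per connection; `Request`, `Envelope`, and the `Vec<u8>` writers are hypothetical stand-ins for the real protobuf types and socket handles:

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a decoded ABCI request.
enum Request {
    Echo(String),
}

/// A parsed request paired with the writer it must be answered on.
struct Envelope<W: Write + Send> {
    request: Request,
    origin: W,
}

/// Spawns a long-lived worker for one connection type and returns the
/// sending half of its queue.
fn spawn_connection_worker<W: Write + Send + 'static>() -> mpsc::Sender<Envelope<W>> {
    let (tx, rx) = mpsc::channel::<Envelope<W>>();
    thread::spawn(move || {
        // The queue is FIFO and drained by one thread, so responses are
        // written back in request order for this connection.
        for Envelope { request, mut origin } in rx {
            match request {
                Request::Echo(msg) => {
                    let _ = origin.write_all(msg.as_bytes());
                }
            }
        }
    });
    tx
}

fn main() {
    // One queue per Tendermint connection: consensus, mempool, info.
    let consensus = spawn_connection_worker::<Vec<u8>>();
    let _mempool = spawn_connection_worker::<Vec<u8>>();
    let _info = spawn_connection_worker::<Vec<u8>>();

    consensus
        .send(Envelope { request: Request::Echo("hello".into()), origin: Vec::new() })
        .expect("worker alive");
}
```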
Have you considered making the application responsible for locking its own state? In such a setup, the mutex in respond would become unnecessary, and it would be up to the implementation to properly lock application state. In that case, the application developer could separate application state as appropriate, lock the pieces separately, and allow for parallel execution of operations that do not use the same state. Worst case, the application developer just uses one mutex for simplicity, which is basically what is now hardcoded into the library.
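A minimal sketch of what app-managed locking could look like, assuming a hypothetical `App` that splits its state behind two independent mutexes so mempool and consensus calls don't contend:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical app that owns its own locks at whatever granularity it
/// wants, rather than the server serializing every call behind one mutex.
#[derive(Clone, Default)]
struct App {
    /// Touched by the consensus connection (DeliverTx/Commit).
    accounts: Arc<Mutex<HashMap<String, u64>>>,
    /// Touched by the mempool connection (CheckTx); independent lock, so
    /// CheckTx can run while a DeliverTx is in flight.
    seen_nonces: Arc<Mutex<HashMap<String, u64>>>,
}

impl App {
    fn check_tx(&self, from: &str) -> bool {
        let mut nonces = self.seen_nonces.lock().unwrap();
        *nonces.entry(from.to_owned()).or_insert(0) += 1;
        true
    }

    fn deliver_tx(&self, to: &str, amount: u64) {
        let mut accounts = self.accounts.lock().unwrap();
        *accounts.entry(to.to_owned()).or_insert(0) += amount;
    }
}

fn main() {
    let app = App::default();
    assert!(app.check_tx("alice"));
    app.deliver_tx("bob", 5);
}
```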
To go down that route, have you considered making the various methods of `Application` take `&self` instead of `&mut self`? Then requests on the different connections could be handled concurrently, with the app deciding what to lock.
Yes, that could be reasonable -- running the app requires it anyway: https://github.com/tendermint/rust-abci/blob/develop/src/lib.rs#L118
Yes, that could work -- I guess it'd also need to keep track of the order, as the responses need to be written in order (currently this is enforced by the mutex).
I don't know how relevant it is to the current conversation, but I've been experimenting with different architectures for implementing ABCI (keeping existing issues of ABCI in mind, #30, #49), and have come up with the following design: https://github.com/devashishdxt/abci-rs
@devashishdxt looks good to me. I assume it makes sense to split up the application into those three traits, and to make them thread-safe.

What could be improved is the spawning of a thread for each incoming connection.

I'm starting to think that you might not want to have the application deal with any concurrency at all, instead modelling it as a set of methods that would take some state. Maybe each trait could have an associated type representing the associated state, and a method returning an empty state on start-up; it would then be the responsibility of ABCI to store the initially empty state and pass it along as a mutable reference on each call.

That way developers just need to implement a trait taking a mutable reference to their state, without having to think about locking.
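A minimal sketch of that associated-state idea; the trait and method names here are hypothetical, not the actual abci-rs API:

```rust
/// Each connection trait owns an associated state type; the server
/// stores the state and threads it through as `&mut`, so the
/// implementor never sees a lock.
trait Consensus {
    type State;

    /// Called once at start-up; the server keeps the returned state.
    fn initial_state() -> Self::State;

    /// The server passes the state back in, mutably, on every call.
    fn deliver_tx(state: &mut Self::State, tx: &[u8]);
}

struct Counter;

impl Consensus for Counter {
    type State = u64;

    fn initial_state() -> u64 {
        0
    }

    fn deliver_tx(state: &mut u64, _tx: &[u8]) {
        *state += 1;
    }
}

fn main() {
    // What the server side would do, reduced to a single call site.
    let mut state = Counter::initial_state();
    Counter::deliver_tx(&mut state, b"tx");
    assert_eq!(state, 1);
}
```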
Why not use an Actor model?
😄 reminds me of the good 'ole Erlang days...
In a normal network application, I'd worry about spawning a thread for each incoming connection. But tendermint only creates three connections, so there can only be three spawned threads at any point in time. One option is to use a threadpool of three threads, but it won't make any difference in the current scenario.
The reason I think state management should be left up to the application developer is that, as a library, we cannot anticipate all the possible use cases a developer may want to solve, so it is very difficult to design an API that suits them all. Besides, developers may want to manage state on their own (for example, deciding on the persistence layer or the encoding of state).
I guess that's possible -- in Actix: https://actix.rs/book/actix/sec-2-actor.html
What I meant was to rather use (three?) long-lived threads, a bit like "actors" as suggested above. That way you avoid creating and destroying threads so often, and you could drop some locks if the data is confined to a dedicated thread.
Sure, and you can still design for some use case (which one?) and make it more convenient. Projects with the needs and resources to do things differently can still use the raw underlying message layer. Concurrency is relatively hard, so it might make sense to abstract it away so people can just implement a trait without having to think about it. Even if some concurrency is required, it can be offered as an API by the library.
@gterzian: Yeah. I guess we'll have to go through multiple design iterations to reach a stable API and decide what works best for 90% of use cases.
The async/await grpc actor approach used in Libra has some interesting ideas. Although I have to say, I'm still on the fence as to whether rearchitecting ABCI right now is worth the extra complexity it would put on users.

As an aside, IMO having a solid Merkle storage approach that works with the ABCI model is probably one of the biggest things lacking in the Rust/Tendermint ecosystem. I took a stab at adapting Exonum's Merkle storage to create a generic framework for Rust ABCI: see Rapido. Merk and JellyMerkleTree are next on the list...
@davebryson you might want to check out https://github.com/nomic-io/merk
@tarcieri Thanks. I have.
Nice work @devashishdxt! I like just using threads (no need for tokio here, right?).

Do you think it's worth having two threads per connection instead of one (ie. one for reading and one for writing), like in the Go version? The advantage is that the next request can be processed before the last response is sent back over the wire, but maybe this isn't important, because most of the latency is likely in the app method (ie. where we do crypto), rather than in sending the bytes back (local socket), and we're going to block on the app method anyway.

I also like the idea of having distinct traits for each connection and not locking the app on every call, but there's still going to need to be synchronization around Commit - ie. while Commit is running on the Consensus conn, the Mempool and Query conns should be blocked. We can start by letting ABCI devs handle this, but it's something that might belong in the server itself. It's worth noting that in Go we do have a global lock on the app for every method, but the Go ABCI server is probably underused since many Go apps just compile in with Tendermint.

Another note: it might be nice if ABCI connection creation came with initiation data, so we could tell right away which kind of connection we're setting up (and hence which methods will be received and in what order, so we can have #49). We can also figure this out on receiving the first ABCI request, but connection metadata would be cleaner, and would provide an opportunity to eg. check versions tendermint/tendermint#2804
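A rough sketch of the Commit synchronization described above, assuming the committed state sits behind an `RwLock`: Commit takes the write lock, while mempool/query threads take read locks and can otherwise run in parallel with each other:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical committed application state.
#[derive(Default)]
struct Committed {
    height: u64,
}

fn main() {
    let state = Arc::new(RwLock::new(Committed::default()));

    // Mempool/query threads take shared read locks...
    let for_query = Arc::clone(&state);
    let query = thread::spawn(move || {
        let s = for_query.read().unwrap();
        println!("query sees height {}", s.height);
    });

    // ...while Commit, on the consensus connection, takes the write
    // lock, blocking the other connections only for its duration.
    {
        let mut s = state.write().unwrap();
        s.height += 1;
    }

    query.join().unwrap();
}
```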
Do you have any examples of what could go wrong (and is critical) if they are not blocked? For query requests, they may return invalid data, but clients should ideally verify what they are returned anyway. In mempool processing, invalid transactions may be admitted, but that's OK (as transactions are still checked in consensus).
That is correct. The reason I did not want to have this in the library is that the ABCI specs mention that updates made to the …
Yes. I think having two threads per connection is a very neat optimisation (maybe one for server IO and one for the ABCI application?).
I agree. I was thinking of including some state management utilities (an abstraction over basic sanity checks without being too opinionated) with the crate.
Right now, a reference to all three trait objects is available to each thread (they might not use them, but they are available). If there were some metadata included in connections, we'd be able to create threads for specific connection types, restricting their access to other parts of the ABCI application.
One extra sanity check could be in …
The moment you add threads of any kind, logic runs "out of order" unless additional synchronization is added. In the referenced article, even the "single queue with a lock" can result in "out of order" execution. This is not due to futures; it is due to threading, and it is resolved by coordination of some sort.
I think it's a debatable issue. The nice thing about using futures, and possibly even Tokio tasks, is that it would compose better with other applications already written with Tokio. This seems particularly useful for things like peg zones, where there's a pretty high chance the application is going to be using Tokio anyway (as an example, we'd like to integrate with the Zebra Zcash Node).
Another approach would be to make tokio an optional dependency and allow apps to choose either a threads-and-blocking-I/O or async strategy. If you're amenable to that, I could potentially contribute the tokio side of things.
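A rough sketch of how that optional dependency could be wired up, with hypothetical feature names (`sync`/`async`, where the `async` feature enables an optional `tokio` dependency in Cargo.toml):

```rust
// Hypothetical feature-gated layout: `sync` pulls in no async deps,
// while `async = ["tokio"]` enables the tokio-based server.

#[cfg(feature = "sync")]
pub mod sync_server {
    /// Threads-and-blocking-I/O entry point.
    pub fn serve(addr: &str) -> std::io::Result<()> {
        let _listener = std::net::TcpListener::bind(addr)?;
        // ... accept loop with one thread per ABCI connection ...
        Ok(())
    }
}

#[cfg(feature = "async")]
pub mod async_server {
    /// tokio entry point; same protocol, different I/O strategy.
    pub async fn serve(addr: &str) -> std::io::Result<()> {
        let _listener = tokio::net::TcpListener::bind(addr).await?;
        // ... accept loop spawning a task per ABCI connection ...
        Ok(())
    }
}
```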
I'm a spectator to Rust ABCI, not a developer or user of it, so discount this comment appropriately to what I have at stake, but...

It's a bit confusing to read comments saying that something can "just" use threads, rather than "needing" Tokio, because the assumption seems to be that using explicit threading and synchronization is easier, simpler, or for some other reason better-by-default than using futures with Tokio, and I don't think this assumption is valid. In fact, I would suggest that the opposite is true: unless there's a compelling reason to use explicit threading, it's better to use futures, which are the preferred Rust abstraction for this problem.

As @tarcieri notes, using Tokio for ABCI makes it easier to interoperate with other software using Tokio. But this is not really about whether to use Tokio or not; it's about whether to use an async, futures-based model like the rest of the Rust networking ecosystem, or to use explicit threading and force everyone wanting to use ABCI to define their own translation layer between its synchronous model and their async futures code.
I don't think this is quite correct. Tokio's tasks are not scheduled in order, but tasks are not the same thing as futures. Tasks are more akin to goroutines or some kind of greenthread, while a future is a unit of asynchronous computation. So you will get exactly the kind of guarantee you want just by doing, e.g.,

```rust
abci.deliver_tx(/*...*/).await?;
abci.deliver_tx(/*...*/).await?;
abci.deliver_tx(/*...*/).await?;
```

on a single task -- the futures are not executed concurrently or in parallel unless that's explicitly requested with, e.g., `join!` or by spawning additional tasks.
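For contrast, here's a tiny self-contained illustration of that guarantee (using the `futures` crate; `step` is a hypothetical stand-in for an ABCI call): within one task, `.await` points run strictly in order, and concurrency appears only when asked for via `join!`:

```rust
use futures::executor::block_on;
use futures::join;

// `step` stands in for an ABCI call that returns a future.
async fn step(label: &'static str) -> &'static str {
    label
}

fn main() {
    block_on(async {
        // Sequential: the second future isn't even created until the
        // first completes, giving strict in-order execution on one task.
        let a = step("deliver_tx #1").await;
        let b = step("deliver_tx #2").await;
        println!("{} then {}", a, b);

        // Concurrency only happens when explicitly requested:
        let (c, d) = join!(step("check_tx"), step("query"));
        println!("{} and {} may interleave", c, d);
    });
}
```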
I would just note that this is exactly the kind of question you don't have to think about if you use Tokio instead of explicit threading.
Compared to providing a futures-based async API, this seems like an enormous amount of extra work for very little benefit.
Happy to see there's a growing buzz at the intersection of Tendermint and Rust.

It seems part of this thread is a dance around the issue of whether this crate is an application framework or just a barebones interface to ABCI - for example, the suggestions that this crate should provide a Merkle store or its own abstractions. I would vote that those things belong in external crates, so that this crate doesn't impose decisions on the application other than what the ABCI protocol already requires.

As for the threads vs futures conversation, I'm not well informed on this, but it seems to me that the tokio-based model is easier to convert to a blocking queue model if that's what the application developer wants, rather than vice versa. Async seems to shine when dealing with many messages in a request/response model, which is why it works well in the HTTP server world.
It depends on how the app manages state, and whether there are data races between query and commit. Imagine a range query over a key range that is updated during commit, so some keys are the old version and some are the new version. Yes, clients doing verification should detect this as invalid (since some of the keys will have proofs in one state hash and some in another), but it seems like the kind of guarantee we'd want correct servers to provide - clients already have enough nonsense to deal with.
We try to provide the strongest guarantees we can in the mempool, which roughly translate to "correct proposers will not propose invalid sequences of txs", where a valid sequence is one that passes the local CheckTx calls, in order, against the latest committed state. We also (optionally) replay all remaining mempool txs after a block is committed to see which are still valid. So it's possible the synchronization here could be lazier, but it's subtle. I'm hoping we can specify the ABCI connection semantics more formally soon.
That is correct. We can manage order of execution the same way we do with our threaded model (thanks to `async`/`await` executing futures in order within a task).
@tarcieri @hdevalence @mappum thanks for your thoughts.
I did have this assumption, from ergonomics, performance, and dependency perspectives, since we have only 3 ordered, long-lived connections and minimal synchronization between them. Also, given that an ABCI app is a linearizable, deterministic state machine, I was thinking they wouldn't have async components (eg. they don't make network calls). But of course, they are expected to persist state, aspects of which may for various reasons need to happen asynchronously, and some folks (eg. Agoric) are even starting to explore more async ABCI app designs - who knows, maybe one day folks will introduce networking layers into them. There's also the inherently async nature of DeliverTx and CheckTx calls. And ABCI is primarily a library to be used by many people. So yeah, it probably does make sense to adopt tokio here.

That said, what would you consider a compelling reason to use explicit threading? We're having a similar discussion about what concurrency model to use for the new Tendermint full node. Again, since we're not building the kind of thing where async really shines architecture-wise (ie. millions of connections), and since Tendermint is more of a binary than a library, we've had the impression we might not "need" the ergonomic/dependency overhead of tokio, but we're working on some experiments to feel it out.
Pretty much the main reason to use threads and blocking I/O at this point is that you don't want the dependency on an async reactor like tokio, possibly to work in environments where dependencies like mio aren't available. Note that doing this leaves a number of tricky problems unsolved, like I/O timeouts. The main reason I used threads + blocking I/O in tmkms was to ship something on stable Rust that would be usable ASAP. Now that `async`/`await` is available on stable Rust, that's much less of a concern.
The whole "roflscale number of connections" angle belies the real point of using an async system, IMO, which is a composable event-driven system which makes it easier to plug ABCI into a complex existing application like a Bitcoin/Ethereum/Zcash node. |
I'd say a lot can be said in favor of native threads in terms of simplicity, flexibility, and also the preservation of boundaries.

Flexibility: a thread can be used to model a larger range of runtime behavior. Tasks are meant for non-blocking code. When using native threads, you can essentially build your own runtime to fit your exact needs. That doesn't mean it needs to be a very complicated runtime, by the way, but it might run according to a very specific set of constraints, for example as suggested in #49. This could also be relevant if your system actually would want to run user-supplied async code as tasks.

Simplicity: threads, and related tools, really have a simple API. Spawn a thread, add some local state, block on a channel, and handle messages by operating on local state. There is not much more to worry about. Even with shared state and mutexes/condvars, the mandatory lock guards keep things manageable.

Preservation of boundaries: in systems of almost any size, it pays to preserve boundaries between components, some of which can have completely different runtime behavior and needs. If you start passing futures around, you are much more quickly tied into one way of doing things, and that can be unnecessarily constraining. Threads don't force the rest of the system to use threads in the way futures can; in fact, if the coarse-grained units in the system are modeled with threads, they can internally still own, say, a Tokio runtime and run futures if that fits their needs. What is best avoided, in my opinion, is passing futures around from one component to another, which pretty much enforces a certain style where it might not be fitting.

I find the article below (the title doesn't do it justice at all) provides a very good collection of wisdom on multithreading in general. If you scroll about half-way down (sorry, can't link to the specific paragraph), you'll find an "Alternatives to mutexes" section.
It is followed by a critique of "The event-driven style" and a dedicated "The fallacy of thread context-switch cost" paragraph. A thought-provoking read: https://www.chromium.org/developers/lock-and-condition-variable
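A minimal sketch of that "spawn a thread, keep local state, block on a channel" shape, with a hypothetical `Msg` protocol; note the counter needs no `Mutex` because it never leaves its thread:

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message protocol for one long-lived worker.
enum Msg {
    Increment,
    Get(mpsc::Sender<u64>),
    Quit,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Msg>();

    let worker = thread::spawn(move || {
        let mut counter: u64 = 0; // thread-confined state, no lock needed
        for msg in rx {
            match msg {
                Msg::Increment => counter += 1,
                Msg::Get(reply) => {
                    let _ = reply.send(counter);
                }
                Msg::Quit => break,
            }
        }
    });

    tx.send(Msg::Increment).unwrap();
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Msg::Get(reply_tx)).unwrap();
    println!("counter = {}", reply_rx.recv().unwrap());
    tx.send(Msg::Quit).unwrap();
    worker.join().unwrap();
}
```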
A quick note on this: tokio has a multithreaded scheduler, so brief periods of blocking (say, to persist to a synchronizing backing store) don't hang the event reactor the way they would a single-threaded event loop (e.g. Node). That said, it's certainly not suitable for indefinite blocking, and that approach only really makes sense for things which block briefly.
I'd be curious what you envision for composing a threads-and-blocking-I/O-based ABCI with a tokio app like Zebrad (or …).

It seems like one of the main use cases for ABCI is embedding it inside existing full node implementations for the purposes of peg zones, and in practice pretty much all of those are written using tokio. If you do decide to go with threads-and-blocking-I/O instead of tokio, it seems extremely important to me to come up with a plan for how to compose it with existing tokio-based applications.

The tricky bit an async event framework like tokio solves for you is multiplexing I/O events with other inter-task futures. While you can use e.g. channels to communicate with other threads, you now have the problem that those threads have interests in both I/O readiness and channel readiness. At that point, aside from hacks like busywaiting or racy signal-based approaches, the only composable model is to use another I/O handle (e.g. a pipe or socketpair) to wake the event loop.

tl;dr: if you do end up going with threads-and-blocking-I/O, what is the concrete plan for integrating ABCI with tokio applications?
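One common shape for that bridge, sketched under the assumption that the blocking side only needs to push events: a tokio unbounded channel, whose sender half is a plain synchronous call, wakes the async side without pipes or busy-waiting:

```rust
use std::thread;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel::<String>();

    // Threads-and-blocking-I/O side (e.g. an ABCI connection thread).
    thread::spawn(move || {
        // `send` on an UnboundedSender is synchronous and never blocks.
        let _ = tx.send("deliver_tx".to_string());
    });

    // Tokio side: channel readiness composes with other futures here
    // (e.g. via select!), which is exactly the multiplexing problem
    // described above.
    while let Some(event) = rx.recv().await {
        println!("got {}", event);
    }
}
```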
Here is a quick PR to show what I mean with the above, note I haven't actually run it yet: #95 |
@gterzian while that appears to accomplish your stated goal of not using futures or async/await, I'm still confused how that's helpful
Interesting discussion. I was just thinking about trying out rust-abci on a simple app, and then saw this discussion.

I saw the point that the main use case is embedding ABCI in existing full nodes, all written with Tokio. Can you link these? I am very curious what ABCI apps exist; a canonical list of both experiments and (partially) maintained code would be a great reference, both for devs and to ground this conversation in the current state of affairs.

If there is minimal existing code, a complete rearchitecture (if it makes future clients easier) is quite attractive. If there is much existing code built on this, the refactoring cost must be factored in (and ideally a similar API kept, even if the implementation changes).
Ok, let me try to clear things up then.

I think the approach taken in #95 makes the API more flexible and easier to use than if it were explicitly based on async/futures. And this flexibility goes both ways: the app can be written in any style, and the ABCI layer also retains the same freedom.

The API is more flexible because you can write your app any way you want, including without any concurrency. The flexibility also makes the API quite easy to use: the existing "counter" app required only minimal change, and it was easy to add an example using threads and message-passing. Someone else could write another example using Tokio with equal ease. If the API were async, then everyone would have to write an async app (or jump through hoops by adding some intermediary layer). I think it's better if everyone can choose their own style, including async if that's what they want.

It's also not necessary to use futures in the API to enable an app to be written using futures. I can't think of a simpler API than one passing a …
Based on the inputs and comments on this thread, I've modified the implementation of abci-rs. Code: https://github.com/devashishdxt/abci-rs. Code review and critique appreciated.
@devashishdxt nice! That's definitely along the lines I was thinking, and kudos for implementing async traits. I haven't played with async-std much yet. I might attempt to play with your crate and see if I can get it working in tandem with a tokio-based application.
That is correct; abci-rs uses async-std. I don't know, in detail, the difference in performance of the two crates, but I think both should be very similar in terms of performance.
I'd agree that for this use case it's probably fine. I'll try integrating it into a tokio-based app.
Hi @devashishdxt, I took a look at the code and it seems rather straightforward, the main chunk of non-trivial logic being in …
Points on the counter example: …
I don't mean to be too nit-picky - this is nice work. Thank you for doing it; I'm just sending some ideas for enhancements.
I agree with you. We need to think more about the API in terms of internal state management. In #61 (comment), I did mention that we need to think about an abstraction over state which will also have all the state-related sanity checks built in.
Great. I built this abstraction in Go two times (for the original cosmos-sdk and then weave) and I'm happy to rework the counter example to provide what I find to be a cleaner abstraction around state management. If you are open to such a PR, let me know and I will work on it next week.
That’d be very helpful. Thank you. |
I am strongly of the opinion that this ABCI component should just handle sockets, serialization, and deserialization - it seems the current assumption is that it should also be a sort of base framework for applications and make decisions about managing state.

I started a lower-level implementation here which lets the consumer handle connections as they want: https://github.com/nomic-io/abci2

It's implemented the way @ebuchman suggested above in this thread: a read thread and a write thread per connection which handle the socket IO and the protobuf encoding, with channels for consumers to read and write through these threads (very Golang-like). On top of this, it would be trivial to implement any kind of application interface, e.g. a simple synchronous trait.

Async is cool and I've been a long-time fan of JS Promises, but I'm not sure there's any difference if the low-level ABCI connection uses tokio/mio vs standard blocking sockets, since we have a very small number of connections - that being said, I can see how application frameworks can make good use of them if they happen to do a lot of IO (e.g. reading from the state database).

Note that I needed to implement ABCI like this in order to enable concurrency. Both of the existing ABCI crates don't allow for any concurrent request processing, since they both block (or await) until they receive a response for the most recent request. ABCI does support concurrency since it will write multiple requests before receiving a response, but it is up to the ABCI server to respond to the requests in the order it received them.
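A minimal sketch of that per-connection read-thread/write-thread layout (not the actual abci2 code); the `Vec<u8>` payloads are hypothetical stand-ins for decoded/encoded protobuf messages:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (req_tx, req_rx) = mpsc::channel::<Vec<u8>>(); // read thread -> consumer
    let (res_tx, res_rx) = mpsc::channel::<Vec<u8>>(); // consumer -> write thread

    // Read thread: would decode length-prefixed protobuf from the socket.
    let reader = thread::spawn(move || {
        for raw in vec![b"req1".to_vec(), b"req2".to_vec()] {
            req_tx.send(raw).unwrap();
        }
    });

    // Write thread: would encode and write responses to the socket.
    // Because the channel is FIFO and drained by this single writer,
    // responses go out in the order the consumer sends them.
    let writer = thread::spawn(move || {
        for res in res_rx {
            println!("writing {:?} to socket", res);
        }
    });

    // Consumer: free to process however it likes, as long as it sends
    // responses back in request order.
    for req in req_rx {
        res_tx.send(req).unwrap();
    }
    drop(res_tx);

    reader.join().unwrap();
    writer.join().unwrap();
}
```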
I am coming at this from a radically different perspective: looking for a solution to a protocol I didn't expect to be so complex in implementation given the requirements, and finding myriad complexity in a place I think could be solved a lot more simply - and I'm not sure it's entirely their fault.

As someone spanning the gap between an expert Rust programmer and someone running production validator infrastructure, this solution seems far too complex to me for the actual purpose it's accomplishing. The idea of shipping anything that complex to production is somewhat terrifying to me (multiple interdependent subprocesses where I don't think they need to exist). That's okay, it's a prototype, but unless there's movement on addressing some of the concerns I've raised in this thread, it seems like there won't be movement on simpler solutions.

The challenges of simultaneously integrating and reacting to things like: …
...really seem like a textbook use case for at least some of the tools in the async ecosystem. All that said, https://github.com/devashishdxt/abci-rs seems like what I want, so I guess perhaps I should just invest my time there?
Whoa, interesting - they use it only for the gossip network and do not have a consensus state. I understand why they did that if there aren't better tools out there for creating gossip networks, but using Tendermint for a network that doesn't need consensus seems kind of like using an internal combustion engine as a space heater. And if you don't need Tendermint, you don't need ABCI.

Either way, if you're using ABCI you still have to go through the choke point of the 3 connections that Tendermint makes and respond to each request in order, so async won't create any sort of performance difference for that part of the system. You could still provide the exact same top-level async interface either way.
I'm completely unconcerned with "performance" in this case, and much more concerned with simple composition. Async allows for universally composable reactive designs, which is something I think could've benefitted the above project, whose goals are a small part of what I'd like to do with an even more complex ABCI application - and yet that project in particular already seems overcomplicated to me.
@tarcieri I hear your complaint above, and I think this is largely a critique of the design not matching your needs, rather than of abci-rs specifically. Tendermint ABCI is a complex interface. For the white-city demo, I would propose using a Rust implementation of libp2p, which should be more tailored to their use case.
For the example of a relay chain - are you building a peg zone, or just a relay? If a relay, you don't need tendermint/abci at all, I believe. In fact, there was much discussion with tendermint devs that an ABCI app connecting back to tendermint RPC is an anti-pattern. Why not just make it a separate process that talks to tendermint RPC as a client - a much simpler design? You can run such a relay on every validator node and use a private key for permissioning, and I think accomplish what you want.

For a peg zone, there is so much complexity with IBC and shared security (cross-chain collateralization) that I think the ABCI interface is a super minor part. I would honestly try to use the out-of-the-box cosmos-sdk for the blockchain part and run the relay part as a separate (Rust-only, no ABCI) process. I think this would give a much cleaner architecture. However, I do not understand your use case fully - can you reference any architecture docs for the referenced project? I think a rearchitecture could reduce complexity significantly.
We're building a zero-knowledge peg zone that bridges IBC and the Zcash shielded pool via what we eventually hope to make a 2-way peg: https://www.zfnd.org/blog/eng-roadmap-2020/
I'm aware of this (see the aforementioned DKG which uses this architecture); however, speaking as someone who both ran qmail in the '90s and is running Rust-based ABCI apps using this architecture today, I'm really curious to know why. Complex multiprocess architectures are a pain in the ass to debug. I personally think it's much simpler to use no more processes than are strictly necessary: simplicity to me means fewer moving pieces. Especially for something like a DKG use case, I'm curious how much of the rationale is based on "there weren't good alternatives to a multiprocess architecture".

If we were to go the simple lego-blocks model of plugging together processes based on what's available, here's what I'm counting for the above project: …
That's a lot.
It's not public yet, but will be soon. We're trying to nail down the architecture, and I think the main point of contention right now is around how many processes are necessary. Personally I'd like 3: …
Now add zero-knowledge DKG, zero-knowledge fraud proofs, collective proving among the validator set, etc 😜 Yes, there are definitely much harder problems we're grappling with on this particular project, but they seem off topic for this thread.
Okay, I see your points. I would say the Go implementation of tendermint is also a fair bit more complex than it needs to be. Maybe a lightweight Rust version could bring your list down to one node plus a KMS. I must say Zebra sounds quite interesting...
Heh. The "why" was more from a deterministic state machine perspective, where the ABCI app is hidden from the world and shouldn't engage in operations which could be non-deterministic, like network requests. But it's increasingly looking like the multi-process complexity isn't worth it and we just need ABCI applications and frameworks to make sure the deterministic state machine is properly isolated from non-determinism (eg. see the recent move of the Cosmos REST api back into the gaiad binary ...). |
@mappum is the same true in the other direction? If one direction is more natural/simpler, we should probably do that, so we can consolidate to a single implementation that provides the interfaces folks need, even if that's both async/await and non-async/await.
We can provide the same interface using either model.
I read this thread a while ago, and was on the side of no async in ABCI apps. But it seemed like it got settled that way cuz zcash or parity does it... something like that. I generally agree with #61 (comment).

In the end, the code went async, without clear consensus. Can someone explain why? The state machine itself, which reads/writes to a merkle store and does math, should be synchronous, right? And we just need to run three event threads, which need to keep a strict ordering of the messages they process. Even if we want to do some sort of pipelining, …

If this is really needed for some sort of external compatibility and not for inherent architecture reasons, I think we should have two implementations of this level.
I think there are effectively two directions you can go here: …
Of these two, I'm weakly a fan of the former: it means you can write higher-level abstractions specialized to particular executors, or use existing ones like …

There is a finer-grained middle ground: you can build the core protocol processing as separate, fine-grained request-processing and response-processing functions, then implement both a threaded blocking runtime and an async runtime in terms of the same core functions, with the former using blocking I/O and the latter non-blocking I/O.

However, though some parts can be shared, this still involves writing a lot of the functionality twice ("WET", c'est la Rust), and can be somewhat annoying. I can't think of any popular crates that have done this successfully offhand, or I'd give an example.
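A sketch of that middle ground, with a hypothetical pure `process_request` core wrapped by both a blocking and a tokio front-end; the duplication lands in the thin I/O shells rather than in the protocol logic:

```rust
/// Pure protocol logic: decode, dispatch, encode. No I/O here.
fn process_request(req: &[u8]) -> Vec<u8> {
    let mut out = b"resp:".to_vec();
    out.extend_from_slice(req);
    out
}

mod blocking_front {
    use std::io::{Read, Write};

    /// Threaded blocking runtime built on the shared core.
    pub fn handle<S: Read + Write>(sock: &mut S) -> std::io::Result<()> {
        let mut buf = [0u8; 1024];
        let n = sock.read(&mut buf)?;
        let resp = super::process_request(&buf[..n]);
        sock.write_all(&resp)
    }
}

mod async_front {
    use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};

    /// Async runtime built on the same shared core.
    pub async fn handle<S: AsyncRead + AsyncWrite + Unpin>(
        sock: &mut S,
    ) -> std::io::Result<()> {
        let mut buf = [0u8; 1024];
        let n = sock.read(&mut buf).await?;
        let resp = super::process_request(&buf[..n]);
        sock.write_all(&resp).await
    }
}

fn main() {
    // Both fronts reuse the same core:
    assert_eq!(process_request(b"x"), b"resp:x".to_vec());
}
```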
These are good points. I think it'll be good to list some of these consequences in https://github.com/informalsystems/tendermint-rs/pull/510/files#diff-8560f0a125ba5a5011ace9b97f504ab5R41

I'll close this issue, as "tendermint/rust-abci" will likely be archived and future development (reflecting these changes) will continue under https://github.com/informalsystems/tendermint-rs/ (once informalsystems/tendermint-rs#489 is finalised and merged).
as discussed in #50 and #31, there are 3 connections: consensus, mempool, and info (query).

Message exchanges on the info and mempool connections probably do not require borrowing the ABCI application as mutable, so it may be possible to rearchitect ABCI without the mutex on the entire application (so that, for example, CheckTx/Info/Query can be processed while consensus operations are in progress).
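To make the issue concrete, here is a minimal sketch of such a split, with hypothetical trait names: only the consensus connection borrows the application mutably, so a server could run the other two concurrently:

```rust
/// One trait per Tendermint connection. Only consensus needs `&mut self`;
/// mempool and info take `&self`.
trait Consensus {
    fn deliver_tx(&mut self, tx: &[u8]);
    fn commit(&mut self) -> Vec<u8>;
}

trait Mempool {
    fn check_tx(&self, tx: &[u8]) -> bool;
}

trait Info {
    fn query(&self, path: &str) -> Vec<u8>;
}

struct Counter {
    value: u64,
}

impl Consensus for Counter {
    fn deliver_tx(&mut self, _tx: &[u8]) {
        self.value += 1;
    }
    fn commit(&mut self) -> Vec<u8> {
        self.value.to_be_bytes().to_vec()
    }
}

impl Mempool for Counter {
    fn check_tx(&self, _tx: &[u8]) -> bool {
        true
    }
}

impl Info for Counter {
    fn query(&self, _path: &str) -> Vec<u8> {
        self.value.to_be_bytes().to_vec()
    }
}

fn main() {
    let mut app = Counter { value: 0 };
    assert!(app.check_tx(b"tx")); // &self: fine while consensus is idle
    app.deliver_tx(b"tx"); //        &mut self: exclusive access
    assert_eq!(app.commit(), 1u64.to_be_bytes().to_vec());
}
```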