System scheduling #2747
-
Plugin interop
I agree with this; being able to convince plugins to play nice with each other (and your own app structure) is one of the major benefits of label exporting. See #2160 for some more discussion on this, and how we might extend it further. The fundamental tradeoff to be made here is configurability vs. leaking internal details everywhere and making it far too easy to break the invariants of your dependencies.
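For context, the label-export pattern being referred to looks roughly like this (a minimal sketch assuming Bevy 0.6-style APIs; the plugin, label, and system names are invented for illustration):

```rust
use bevy::prelude::*;

// The plugin exports a public label so apps (and other plugins) can order their
// own systems relative to its internals without knowing what those internals are.
#[derive(Debug, Clone, PartialEq, Eq, Hash, SystemLabel)]
pub enum PhysicsLabel {
    Integrate,
}

pub struct PhysicsPlugin;

impl Plugin for PhysicsPlugin {
    fn build(&self, app: &mut App) {
        app.add_system(integrate.label(PhysicsLabel::Integrate));
    }
}

fn integrate() { /* the plugin's internal work */ }

// In the consuming app: hook your own system in after the exported label.
fn read_results() { /* uses whatever `integrate` produced */ }

fn main() {
    App::new()
        .add_plugins(MinimalPlugins)
        .add_plugin(PhysicsPlugin)
        .add_system(read_results.after(PhysicsLabel::Integrate))
        .run();
}
```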
-
As-if dependencies
I agree; this is a solid idea. The distinction is meaningful, particularly with many-to-many labels. We should add these as primitives, and then build higher-level APIs to support both flavors of dependency edges.
-
Higher-level system ordering APIs
These are definitely needed. #2381 is a solid starting point IMO; what additional functionality would you like to see? Obviously we'd want some way to specify that subgraphs use as-if logic, but that's handled easily enough via some form of API specialization.
I fundamentally disagree with this. The existing parallel executor enforces the minimum possible set of forbidden system overlaps; this is the correct starting point. Ordered dependencies à la task-level cross-system dependencies go beyond those primitives, but I don't think they're a feasible idea. But I'll get into that in another comment to try and keep things organized :)
-
Task-level dependencies
Fundamentally, these are going to play very, very poorly with Rust and Bevy's ownership model. As I understand it, (table-stored) data within the ECS is ultimately broken down into archetype-component blocks for efficient, dense memory storage. Having two systems that are ordered in sequence operate on different parts of the same data at once seems like a very large source of complexity and risk, and would require communication and polling of partially complete state. I worry that as your systems change during development, you'll need to be extremely careful to ensure that the invariants you need to safely perform this split are upheld. However, you're absolutely correct that between-system parallelism is not a panacea: much of the work in many games is in fact going to be bottlenecked by a few heavy systems. I feel quite strongly that the correct path forward is to further optimize our very basic tools for parallelism within a single system (like `par_for_each`). Ultimately, if you have extra threads free while waiting for a heavy system, why bother dealing with the complexity of splitting the work across multiple systems when you can just throw more threads at the heavy system directly?
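For illustration, "throwing more threads at the heavy system" with the existing intra-system parallelism tools looks roughly like this (a minimal sketch assuming Bevy 0.6-style APIs; the `Velocity` component and the batch size are invented for illustration):

```rust
use bevy::prelude::*;
use bevy::tasks::ComputeTaskPool;

#[derive(Component)]
struct Velocity(Vec3);

// A single "heavy" system that parallelizes internally: the query is split into
// batches that run concurrently on the compute task pool, so spare threads help
// even when no other system can run alongside this one.
fn integrate(pool: Res<ComputeTaskPool>, mut query: Query<(&mut Transform, &Velocity)>) {
    query.par_for_each_mut(&pool, 32, |(mut transform, velocity)| {
        transform.translation += velocity.0;
    });
}
```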
-
System order ambiguities
I'm very sympathetic to your concerns about the prevalence of system order ambiguities, and the "allow-by-default" stance that Bevy currently takes. I agree that they tend to create cascading risk as the project grows, and the bugs that they can introduce are very hard to identify (and, right now, impossible to fix by default). The idea of "deterministic by default" is appealing: see #2480 for some of my own thoughts on the matter. However, the proponents of the current approach have some solid points:
On that note, let's think about how we can improve those tools and bridge the gap. My opinion is still that system order ambiguities are sufficiently painful that they should be forbidden by default, but we're not likely to convince the rest of the dev team or user base until resolving every ambiguity is intuitive and maintainable. Here are the steps I think we can take to get there. On the ambiguity front:
Items 2 and 3 must be permitted to pierce the plugin veil; otherwise bugs cannot be tracked down. For system ordering:
-
Runtime ambiguity detection
This is a dangerous path: runtime detection is unreliable, as the set of archetypes in the world may change at any hard sync point, in any way. The consequences of failure are critical: you will get crashes, horrible bugs, and genuine unsoundness. The obvious solution is to run the runtime check at each hard sync point, and then log and panic if a problem is detected. This is expensive and frustrating. Moreover, any guarantees that you receive here are provisional: prone to breaking suddenly during refactors, when loading scenes, or when a previously unexplored corner case is hit. #1481 attempts to bypass this by forcing users to explicitly write out rules about the archetype identities that can be used by, among other things, the ambiguity checker. These would be enforced at component insertion and removal as well, but would represent explicit promises that we could use and reason about for other important things. Archetype invariants are very much an experimental research project though. I'm quite fond of them, but it remains to be seen whether they would be feasible for the average end user to reason about, and performant in production.
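As a purely hypothetical sketch of the archetype-invariant idea (#1481), declaring such a promise might look something like the following; no such API exists in Bevy, and the method and type names are invented:

```rust
// Hypothetical sketch only; nothing like this exists in Bevy today.
fn declare_invariants(app: &mut App) {
    // Promise: every entity with `GlobalTransform` also has `Transform`.
    // This would be enforced at component insertion/removal, and checks like the
    // ambiguity detector could rely on it statically instead of inspecting the
    // archetypes that happen to exist at runtime.
    app.add_archetype_invariant(ArchetypeInvariant::<GlobalTransform>::requires::<Transform>());
}
```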
-
@davidscherer, would you be up for making a PR to introduce the "as-if" dependency edges and expose them publicly? We should find a more intuitive name, but that's something we can discuss in the PR thread.
-
The Bevy website says:
I am really hoping that this is true. I am a new user of Bevy, and I have found that I like most of its design decisions a lot, but I have grave concerns in this area. I think that it is possible to move the frontier of tradeoffs, not just move along it.
Please read my comments in a respectful and constructive tone!
First of all, I think that the major advantage of the current ("new") executor interface is actually one that is not stated very clearly on the page linked above: it provides a way to specify the interleaving of systems that are defined by different plugins. This is a very important capability in a plugin-based engine architecture and must be addressed by any proposed improvements.
Secondly, so you understand where I am coming from, I think there are usually only two reasonable choices of goal for the end effect of the schedule in a particular game: either (a) perfect determinism (as required by deterministic netcode, replay, etc; it's also the easiest model to reason about, test, and debug), or (b) qualitative determinism (there may be small errors because e.g. floating point addition is not commutative, and intentionally random behavior may be sampled differently, but the probability distribution of behavior of the game does not depend qualitatively on scheduling decisions). My personal belief is that perfect determinism should be the default in the ecosystem, because it is the safest, most easily testable, and most composable choice. (For example: It's fairly easy to test if a plugin you are considering adopting is perfectly deterministic, but very hard to tell the difference between one that is qualitatively deterministic and one that has bugs but only if the number of CPU cores is not a multiple of 4. And you know that any combination of two deterministic plugins is deterministic, but the combination of two qualitatively deterministic plugins might not be qualitatively deterministic!) But there are costs in effort and performance that can add up, so reasonable minds could differ on this point. In any case both must be supported, and in any case there is no use for schedules that produce materially ambiguous results; those are almost by definition difficult-to-reproduce bugs waiting to emerge on particular combinations of hardware, room temperature, changes to unrelated systems, and situations in the game.
Unfortunately, the current design makes materially ambiguous schedules the default, and makes it unergonomic, error-prone, and slow to do anything else. In many cases the material problem caused by an ambiguity will not manifest until long after it is introduced or after the plugin containing it is combined with others. I think this is a very serious problem that will compound over time. Even if I fix this for myself, I have to expect that third party plugins are going to be buggier than they otherwise would be (and that their bugs might arise only in combination), just as undefined behavior and data race issues make dependencies risky in a C++ engine.
I do not find that the duplicated information about system inputs and outputs in my build code makes my code particularly "self-documenting". The reason that system A needs to run before system B is, for example, that system A produces a component each frame that system B consumes. Even if I write some traits that let me "document" that, something like the following,
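(a hypothetical sketch, assuming roughly Bevy 0.6-style APIs; the `ProduceVelocity` label, the `Velocity` component, and the system names are invented for illustration):

```rust
use bevy::ecs::schedule::ReportExecutionOrderAmbiguities;
use bevy::prelude::*;

#[derive(Component)]
struct Velocity(Vec3);

// A label that merely restates what the system signatures below already declare:
// `produce_velocity` writes `Velocity`, `apply_velocity` reads it.
#[derive(Debug, Clone, PartialEq, Eq, Hash, SystemLabel)]
struct ProduceVelocity;

fn produce_velocity(_q: Query<&mut Velocity>) { /* writes Velocity */ }
fn apply_velocity(_q: Query<(&mut Transform, &Velocity)>) { /* reads Velocity */ }

fn main() {
    App::new()
        .add_plugins(MinimalPlugins)
        // The `.label()`/`.after()` pair duplicates the data-access information
        // already present in the systems' parameters.
        .add_system(produce_velocity.label(ProduceVelocity))
        .add_system(apply_velocity.after(ProduceVelocity))
        // Logs any remaining ambiguous pairs, but cannot tell whether the
        // ordering above is wrong, stale, or needlessly restrictive.
        .insert_resource(ReportExecutionOrderAmbiguities)
        .run();
}
```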
I have just mechanically duplicated (and have to maintain) a bunch of information that was already in the type signatures of the systems involved. I can use `ReportExecutionOrderAmbiguities` to verify that the resulting schedule is not ambiguous, but it can't verify that the duplicated information here isn't wrong or out of date (and perhaps restricting the schedule unnecessarily), and it will also, necessarily, demand ambiguity sets for systems that Bevy, at runtime, could prove are not conflicting. The ergonomics of this entire process are bad, which is very sad because so much effort has gone, from what I can tell successfully, into making the other parts of Bevy's ECS painless to use.

Moreover, it's not actually that great for performance. For one, the second easiest thing to do (after leaving the schedule completely ambiguous, most likely creating a combinatorial explosion of bugs) is to make the schedule completely serial, which likely gives up much more performance than just constraining the order. Secondly, specifying the order of two systems always forces the scheduler to execute them serially, even if it turns out (e.g. because of the exact set of archetypes at run time) that they could execute not just in either order but in parallel. Thirdly, having more bugs to track down means having less time to optimize. Worst of all, systems are not really the right granularity for parallelism. If you are serious about performance, you are going to want expensive systems to generate multiple tasks using, for example, `.par_for_each_mut()`. But the proper dependencies among these tasks permit much more parallelism (not just more reordering!) than dependencies among the systems that generate them! If system A mutates a component which system B consumes, then the task that consumes that query in system B for the first 1024 entities in archetype X can't be started until the tasks (if any!) that mutate the first 1024 entities of archetype X in system A are complete. But that doesn't mean that the two systems cannot significantly overlap, even in a perfectly deterministic schedule!

I think it is possible to be more ergonomic, faster, deterministic by default, and still make interleaving between plugins specifiable. But I do not think it can be done entirely by layering something on top of the existing interface.
The following proposals should be understood as very tentative. I'm far from an expert on Bevy yet.
At minimum, I think something like an interface distinction between `after` and `as_if_after` is needed at a low level (in the executor itself). `after` is documented and implemented as ensuring that the execution of the systems does not overlap in time at all (serial execution); `as_if_after` would constrain the executor only to ensure that any effects of an "earlier" system in this graph that are observable by the "later" system through components or resources are visible to it as it executes (serializable execution). `as_if_after` cannot be implemented efficiently in terms of `after`; `after` cannot be implemented at all in terms of `as_if_after` without the ability to add some kind of false conflict.
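To make the distinction concrete, here is a hypothetical sketch (the `as_if_after` method does not exist in Bevy, and the label and system names are invented for illustration; the existing calls assume roughly Bevy 0.6-style APIs):

```rust
use bevy::prelude::*;

#[derive(Debug, Clone, PartialEq, Eq, Hash, SystemLabel)]
struct ProduceVelocity;

fn produce_velocity() { /* writes Velocity */ }
fn consume_velocity() { /* reads Velocity */ }

fn add_systems(app: &mut App) {
    // Existing `after`: consume_velocity never overlaps produce_velocity in time.
    app.add_system(produce_velocity.label(ProduceVelocity))
        .add_system(consume_velocity.after(ProduceVelocity));

    // Proposed `as_if_after` (hypothetical, shown commented out): the two systems
    // may overlap in time, as long as every effect of produce_velocity that
    // consume_velocity can observe through components or resources is visible
    // to it by the time it reads them.
    // app.add_system(consume_velocity.as_if_after(ProduceVelocity));
}
```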
Given a good implementation of `as_if_after`, specifying an ordered sequence of systems with it will produce a serializable schedule at least as performant as the best schedule that would currently pass `ReportExecutionOrderAmbiguities` without ambiguity sets (and sometimes a better one, because sometimes a conflict can only be proved impossible at runtime). If you want a schedule faster than that, you should opt out of serializability in some way, asserting (as with ambiguity sets) that you have proven (and commit to maintain) that some systems are sufficiently commutative.

To improve the ergonomics while preserving plugin interleaving, I suggest an algebra of system sets (this PR might be a starting point for this part?), perhaps very roughly like the sketch below.
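(This is a hypothetical sketch of such an algebra, not an existing Bevy API: the `then` combinator is invented, and the `MyPlugin`, `OtherPlugin`, and `Renderer` labels are placeholders matching the names used later in this comment.)

```rust
// Hypothetical system-set algebra; `SystemSet::then` does not exist in Bevy.
// Each `.then(...)` step is "as-if" ordered after the previous one, so the whole
// chain is serializable without forcing strictly serial execution.
fn add_systems(app: &mut App) {
    app.add_system_set(
        SystemSet::new()
            .then(OtherPlugin::Label5)         // consume another plugin's output
            .then(MyPlugin::FrobTransform)     // this plugin's own work
            .then(Renderer::Render),           // hand the results to rendering
    );
}
```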
Although this algebra makes it painless to specify serializable schedules for a single plugin, the underlying model is still defaulting to ambiguity, so you can still get undeclared ambiguity where components and resources cross plugins (e.g. just omit the references to `OtherPlugin` labels in the above example). I think it might be OK to rely on `ReportExecutionOrderAmbiguities` for that, if it can be enabled by default. A more radical approach might be to tie bringing components from another plugin "into scope" for the systems in your SystemSet to including an appropriate label from that plugin in your SystemSet. Essentially this would be a similar check, but it could be stricter and the error message could be more helpful: "`system1` wants `OtherPlugin::Component1`, but it has not been brought into scope for it. Try giving it or an enclosing SystemSet a dependency on `OtherPlugin::Label5` or `OtherPlugin::Label6`." This is not a fully baked idea, but the mental model is that every SystemSet (including individual systems) gets its inputs from, and provides its outputs to, either an enclosing SystemSet or a SystemSet or label that it mentions explicitly. So if I want to use `Transform` in my plugin, I can either `.then( Renderer::Render )` afterward (so that something I explicitly mention consumes the component), or else callers of my plugin must consume `Transform` for me (e.g. by `.then( MyPlugin::FrobTransform ).then( Renderer::Render )`).

Task-level dependencies could be a separate project, focused entirely on (primarily opt-in) optimization. For example: `ParallelQuery` only exposes `.par_*` methods (which don't wait for completion? I have to think about that), and a system taking a `ParallelQuery` could therefore be scheduled without regard to any conflicts created by that query. Then `.par_for_each` would equip each of its tasks with the necessary dependencies created by those conflicts. Or you might actually need `AsyncQuery`, so that the system can be rescheduled when its tasks are finished. There are lots more potential parallelism killers to attack (`ResMut`, commutatively mutable components, etc.). A rough sketch of the `ParallelQuery` idea appears at the end of this comment.

I hope this is understandable and a helpful starting point for discussion.
I have some sympathy for more radical approaches like #2259, because although I want modularity, I never asked for the requirement of writing the top levels of my game not in Rust but in a language with just global variables, conditionals, and function calls (but not function parameters or definitions), even without the additional requirement to explicitly specify the dataflow dependency graph between every line of code in it. I suspect that the problems with modularity could be addressed somehow; perhaps, for example, where today a plugin would export a label so that you can inject systems into the middle of its processing, it would instead accept a function over all the components and resources available at that point? But it is very speculative, and I think there are some tough challenges, so it would probably be a mistake to make solutions to (what I see as) serious problems dependent on that.
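As promised above, here is a hypothetical sketch of the `ParallelQuery` idea (no such type or methods exist in Bevy; names and signatures are invented purely for illustration):

```rust
// Hypothetical only: `ParallelQuery` is a proposed system parameter whose data can
// only be touched through `.par_*` methods. Because the system never accesses the
// data directly, the scheduler could place the system without regard to conflicts
// on Transform/Velocity and attach the real dependencies to the spawned tasks.
fn movement(q: ParallelQuery<(&mut Transform, &Velocity)>) {
    // Each batch becomes a task; the tasks (not the system) carry the dependency
    // edges created by the `&mut Transform` access.
    q.par_for_each(32, |(mut transform, velocity)| {
        transform.translation += velocity.0;
    });
}
```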