Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit system ordering #1040

Closed
alice-i-cecile opened this issue Dec 10, 2020 · 6 comments
Closed

Explicit system ordering #1040

alice-i-cecile opened this issue Dec 10, 2020 · 6 comments
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible

Comments

@alice-i-cecile
Copy link
Member

What problem does this solve or what need does it fill?

Ensuring that one system A has run before the next system B is an extremely common pattern, allowing us to ensure that the data in B has been pre-processed appropriately before we handle it.

As discussed in the alternatives section, there are three ways to get similar behavior: implicit declaration order, stages and system chaining. All of these are inadequate for the common case where you simply care that A has completed before B begins.

Describe the solution would you like?

Option 1. Remove the implicit guarantee that systems added first within a stage run before those added after it. This encourages people writing confusing, brittle code that can't be refactored safely. In its place, add a data-structure that unambiguously specifies the dependency graph of conflicting access to data within a stage, ideally verified at compile-time.

The total order created by the system registration order is sufficient, but not necessary, to ensure that we can always find a deterministic system ordering. While this total ordering is intuitive for small programs, it becomes unwieldy for games with more than one file adding systems to the same stage and the overspecification can cause unpredictable changes to unrelated systems.

Instead, we can get away with constructing a directed acyclic graph, with all systems within the stage as a node. In order for a system to be scheduled, each of its parents must be complete. But much of this graph can be constructed automatically: only key relationships between systems that share conflicting component / resource access need to be disambiguated.

This ensures that ordering constraints of each stage is well-documented in a single place, trivial to helpfully visualize, easy to consume for new schedulers and must be manually changed when important new behavior is modified.

If we ensure that systems are static (and instead toggle optionality with run criteria and custom schedulers), this graph can be constructed and verified unambiguous at compile time, making it a powerful dev tool to prevent painful ordering-driven logic bugs and identify system parallelism bottlenecks.

Option 2. Add a good way to examine and modify the system registration order, in order to detect and fix problems created by system ordering. Adding systems before or after another system could then, at game startup, simple squish into the registration order in a postprocessing step, bumping later entries down. Of course then, if you have multiple systems being added after the same system that was inserted in the ordinary way, you've reproduced the pattern and probably need to rely on registration order of the ordered systems as well :(

Describe the alternative(s) you've considered?

  1. Implicit declaration order: simply declare the systems in order, and the scheduler makes sure they obey certain rules. This is very hard to follow, prone to being accidentally broken in mysterious ways, and forces us to record and verify the declaration order in our scheduler. The problems with this get much worse as the project grows, and systems are added in plug-ins, with the timing interacting in mysterious ways based on the arbitrary order that the plug-ins themselves are added.

  2. Stages: You can ensure that one system runs after another by ensuring that they're in distinct stages. But this introduces a serious parallelism hit: rather than only waiting for your dependencies to complete, you're also stuck waiting for all the other systems in that stage to complete. On larger projects, I could also see this producing a messy proliferation of stages whose purpose and relation becomes unclear.

  3. System chaining: This pattern seems much better suited to simple utility functions added on to systems. You do not seem to be able to delay execution, or ergonomically create branching dependencies in either direction.

Additional context

This is closely related to the discussion in #1021, Stages v2. Having an explicit way to ensure system ordering would make the process of entering and exiting states much cleaner, and this feature will need integration with the .with_run_criteria discussed there.

@alice-i-cecile
Copy link
Member Author

If we decide that both of these behaviours could be desirable for different use cases (simple vs. complex games for example), we could enforce explicit dependencies with a flag of some fashion. I think that the natural place for that to go would be at the stage level, as the complexity of various stages (e.g. UPDATE vs. LAST) could vary significantly even within the same complex game.

@cart
Copy link
Member

cart commented Dec 10, 2020

Remove the implicit guarantee that systems added first within a stage run before those added after it.

This isn't quite true. Execution order is "arbitrary" for systems by default. It is defined by registration order only for systems that conflict.

I'm pretty well convinced at this point that the design @Ratysz has suggested (both on the discord and in his blog post) is the right call. Your suggestion sounds pretty similar.

In a nutshell (@Ratysz correct me if im wrong)

  1. Within a stage, Systems run in parallel by default (in arbitrary order). Systems with no conflicts and no manual dependencies run first
  2. When a system finishes executing, we greedily run new systems whose constraints have been met
  3. Repeat until the schedule is finished

Heuristics can also be added to ensure that high-cost systems run first in contexts where cores are saturated (both in (1) and in (2))

This does get a little interesting when we throw with_run_criteria and SystemSets into the mix (as discussed in #1021), but I think in general thats solved by evaluating execution criteria at the beginning of the stage (which also makes sense because it makes execution logic more predictable).

@Ratysz
Copy link
Contributor

Ratysz commented Dec 10, 2020

In a nutshell [...]

Yes, that's pretty much the gist of it; here is a non-gist of it from a while ago.

That specific implementation suffers from being a third-party library - mostly the Arc<Mutex<Box<dyn System>>> travesty that requires locking to check for conflicts. An integrated version will be easily able to avoid that and only lock for actually running the system, if even that - the algorithm's guarantees of single active access to system's state and lack of component+resource conflicts, failsafed with component and resource borrow checking, should be sufficient to let us do lock-free unsafe access.

[...] solved by evaluating execution criteria at the beginning of the stage [...]

Yes; in the linked implementation, what I would try for accomodating that is changing this:

scope.spawn(async move {
    listener.await;
    system
        .try_lock()
        .unwrap_or_else(|| unreachable!())
        .run(world, resources);
    sender.send(id).await.unwrap();
});

to this:

if system.run_criteria_satisfied() {
    scope.spawn(async move {
        listener.await;
        system
            .try_lock()
            .unwrap_or_else(|| unreachable!())
            .run(world, resources);
        sender.send(id).await.unwrap();
    });
} else {
    scope.spawn(async move {
        listener.await;
        sender.send(id).await.unwrap();
    });
}

This maintains the disabled system's presence for dependency resolution purposes.

Other constraints can also be supported, either in a similar preemptive fashion, or by adding them to the conflict check. We could even support opt-in registration order preservation on a per-system basis.

I would put this all together into a PR, but at this point it would be a draft PR to a draft PR.

@cart
Copy link
Member

cart commented Dec 10, 2020

@Ratysz yeah you might want to wait for me to wrap up the SystemSets impl / us all to (hopefully) agree on that impl.

if system.run_criteria_satisfied()

Just keep in mind that run_criteria is a system too, so it needs world/resource access and it shouldn't run in parallel unless it is properly scheduled.

@cart
Copy link
Member

cart commented Dec 10, 2020

But I also doubt I'll change the current SystemStage much/at all, so you could probably get started on non-systemset scheduling now if you don't want to wait.

@Moxinilian Moxinilian added A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible labels Dec 10, 2020
@alice-i-cecile
Copy link
Member Author

This will be closed by #1144.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible
Projects
None yet
Development

No branches or pull requests

4 participants