Separate narrow phase from solver into NarrowPhasePlugin
#100
Background
Currently, narrow phase collision detection is tightly coupled with the penetration constraint solver. This makes implementing more advanced collision detection features like child colliders quite cumbersome and difficult, makes things much harder to parallelize, makes it impossible to use custom collision detection implementations without rewriting the entire solver, and is overall very annoying to work with.
To make things more modular and easier to work with, it would be great to have a separate narrow phase. However, XPBD has to deal with changing constraint directions and positions in the solver, which can lead to bodies moving into a penetrating state from a non-penetrating state. This can lead to missed collisions if collisions are computed just once before the solver, which is why the separate narrow phase hasn't been implemented earlier.
Implementation
This PR separates the narrow phase into a NarrowPhasePlugin that nicely encapsulates all collision detection logic and handles collision events. The solver then iterates through the collisions, and creates and solves penetration constraints for them. As a nice bonus, colliders no longer have to be attached to rigid bodies for collisions to be detected.
To solve the problem with missed collisions, the narrow phase uses a "prediction distance". It essentially allows "speculative contacts" between entities that aren't quite penetrating yet but might penetrate at some point during the constraint solve, and the penetration constraints check if the bodies are actually colliding by computing the penetration depth at the current state.
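To make the prediction distance idea more concrete, here is a minimal, self-contained sketch using 2D spheres. The types and functions (`Sphere`, `separation`, `narrow_phase`, `solve_penetration`) are hypothetical and only for illustration; they are not this crate's API, and the position correction is a simplified stand-in for the actual XPBD penetration constraint.

```rust
// Hypothetical, simplified types for illustration only; not the crate's API.
#[derive(Clone, Copy, Debug)]
struct Sphere {
    center: [f32; 2],
    radius: f32,
}

/// Signed separation between two spheres: negative when they overlap.
fn separation(a: &Sphere, b: &Sphere) -> f32 {
    let dx = b.center[0] - a.center[0];
    let dy = b.center[1] - a.center[1];
    (dx * dx + dy * dy).sqrt() - (a.radius + b.radius)
}

/// Narrow phase: keep every pair whose separation is below the prediction
/// distance, including pairs that are close but not penetrating yet.
fn narrow_phase(spheres: &[Sphere], prediction_distance: f32) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for i in 0..spheres.len() {
        for j in (i + 1)..spheres.len() {
            if separation(&spheres[i], &spheres[j]) < prediction_distance {
                pairs.push((i, j));
            }
        }
    }
    pairs
}

/// Solver step: recompute the penetration depth at the current state and
/// only correct pairs that are actually penetrating. Returns the number of
/// pairs that were corrected.
fn solve_penetration(spheres: &mut [Sphere], pairs: &[(usize, usize)]) -> usize {
    let mut solved = 0;
    for &(i, j) in pairs {
        let depth = -separation(&spheres[i], &spheres[j]);
        if depth <= 0.0 {
            continue; // speculative contact that is not penetrating right now
        }
        let dx = spheres[j].center[0] - spheres[i].center[0];
        let dy = spheres[j].center[1] - spheres[i].center[1];
        let len = (dx * dx + dy * dy).sqrt();
        // Fall back to an arbitrary normal if the centers coincide.
        let (nx, ny) = if len > 0.0 { (dx / len, dy / len) } else { (1.0, 0.0) };
        // Push both spheres apart by half the depth along the normal.
        spheres[i].center[0] -= nx * depth * 0.5;
        spheres[i].center[1] -= ny * depth * 0.5;
        spheres[j].center[0] += nx * depth * 0.5;
        spheres[j].center[1] += ny * depth * 0.5;
        solved += 1;
    }
    solved
}

fn main() {
    let mut spheres = vec![
        Sphere { center: [0.0, 0.0], radius: 1.0 },
        Sphere { center: [2.05, 0.0], radius: 1.0 }, // close, but not touching
        Sphere { center: [10.0, 0.0], radius: 1.0 }, // far away
    ];
    // Contacts are collected once, before the solver runs.
    let pairs = narrow_phase(&spheres, 0.1);
    println!("speculative pairs: {:?}", pairs);

    // Nothing is penetrating yet, so the solver leaves the pair alone.
    println!("corrected: {}", solve_penetration(&mut spheres, &pairs));

    // Another constraint moves the second sphere into the first one mid-solve.
    spheres[1].center[0] = 1.5;
    println!("corrected: {}", solve_penetration(&mut spheres, &pairs));
}
```

Because the close pair was already collected as a speculative contact, the solver can resolve the penetration that appears mid-solve without running collision detection again.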
In addition, this PR revamps how collision events and CollidingEntities are handled in order to resolve some problems. Collision events are only sent at the end of each run of the PhysicsSchedule instead of during each substep, which from the user's perspective eliminates duplicate events, since users don't generally schedule their systems inside the SubstepSchedule. CollidingEntities are also updated based on the events that are sent at the end of the physics schedule, so you won't get missing entities in cases where the bodies don't happen to be penetrating during the last substep.
Finally, I added a Collisions resource that stores all collisions between colliders in an IndexMap that uses fxhash. This is mainly used for the collision events to track when entities begin/stop penetrating, but in the future, it could be used to optimize the contact manifold computation by giving Parry the previous contact manifolds. Collisions also has some potentially useful methods like collisions_with_entity that people can use.
Performance
Separating the narrow phase from the solver like this increases the number of get_many/get_many_mut calls quite significantly. However, I managed to make some other optimizations to the narrow phase, and I implemented simple multithreading using Bevy's ComputeTaskPool. This multithreading is controlled by the new parallel feature, which is enabled by default.
Below are Tracy traces that show the performance of the most expensive systems. The traces were running for slightly different durations, so the percentages are more important than the raw lengths in seconds. They were run on a 13th Gen Intel i7-13700F.
Previously:
Now, with parallel feature disabled:
Now, with parallel feature enabled:
As you can see, the performance seems to be slightly better overall, especially with multithreading, at least on my machine. In cases where there are very few collisions, the multithreading can cause unnecessary overhead, but in general it should be faster. However, Criterion benchmarks seem to indicate that performance is worse, while in my testing it seems better, so the results are a bit contradictory.
The multithreaded version also seems to be deterministic, but I'm not sure if par_splat_map should actually be deterministic, so it could be just a coincidence. If it is a coincidence, we can just sort the collisions by e.g. entity.
Future work
Separating the narrow phase like this makes it much easier to implement more collision detection features, like the following:
These will be implemented in future PRs.
Todo
CollisionStarted, CollisionEnded and CollidingEntities correctly; they shouldn't be sent/cleared at every substep, but rather at the end of every physics frame
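As a rough illustration of the end-of-frame event handling this item refers to, the sketch below accumulates the pairs that were penetrating in any substep and only compares them against the previous frame once the frame ends. Everything here (`FrameCollisions`, `CollisionEvent`, `record_substep_contact`, `end_frame`, and the use of plain `u32` IDs with a `HashSet`) is a hypothetical stand-in, not the actual Collisions resource or the Bevy event types.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for entity IDs and entity pairs.
type Entity = u32;
type Pair = (Entity, Entity);

#[derive(Debug, PartialEq)]
enum CollisionEvent {
    Started(Pair),
    Ended(Pair),
}

/// Order a pair so that (a, b) and (b, a) map to the same key.
fn ordered(a: Entity, b: Entity) -> Pair {
    if a <= b { (a, b) } else { (b, a) }
}

/// Tracks which pairs penetrated during the previous and current frame.
#[derive(Default)]
struct FrameCollisions {
    previous: HashSet<Pair>,
    current: HashSet<Pair>,
}

impl FrameCollisions {
    /// Called from any substep in which the pair is penetrating.
    /// Repeated calls within one frame do not produce duplicates.
    fn record_substep_contact(&mut self, a: Entity, b: Entity) {
        self.current.insert(ordered(a, b));
    }

    /// Called once at the end of the physics frame: diff the current set
    /// against the previous one and produce started/ended events.
    fn end_frame(&mut self) -> Vec<CollisionEvent> {
        let mut started: Vec<Pair> =
            self.current.difference(&self.previous).copied().collect();
        let mut ended: Vec<Pair> =
            self.previous.difference(&self.current).copied().collect();
        // Sort so the event order does not depend on hash iteration order.
        started.sort();
        ended.sort();

        let mut events = Vec::new();
        events.extend(started.into_iter().map(CollisionEvent::Started));
        events.extend(ended.into_iter().map(CollisionEvent::Ended));

        // The current frame becomes the previous one; start with an empty set.
        self.previous = std::mem::take(&mut self.current);
        events
    }
}

fn main() {
    let mut collisions = FrameCollisions::default();

    // Frame 1: the pair penetrates in two different substeps.
    collisions.record_substep_contact(1, 2);
    collisions.record_substep_contact(2, 1);
    println!("{:?}", collisions.end_frame());

    // Frame 2: penetrating only in an early substep, not in the last one.
    collisions.record_substep_contact(1, 2);
    println!("{:?}", collisions.end_frame());

    // Frame 3: no contact in any substep.
    println!("{:?}", collisions.end_frame());
}
```

In this sketch a pair that only penetrates in an early substep still counts as colliding for the whole frame, which matches the goal of not losing entities when bodies happen not to be penetrating during the last substep.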