Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix axis getting mixed up when split leaf #82436

Merged
merged 1 commit into from
Sep 27, 2023

Conversation

Rindbee
Copy link
Contributor

@Rindbee Rindbee commented Sep 27, 2023

order[0] should hold the longest axis (max_axis_index()), while order[POINT::AXIS_COUNT - 1] should hold the shortest axis (min_axis_index()).

@Rindbee Rindbee requested a review from a team as a code owner September 27, 2023 10:06
@lawnjelly
Copy link
Member

lawnjelly commented Sep 27, 2023

Bear in mind, there is a possibility that the comment is out of date (rather than the code wrong way). 😄

I did quite a bit of empirical testing at the time, so it would be nice to double check this is an improvement with some testing (I have vague memory some of the results were counter intuitive).

Agree it does seem like it should be splitting on longest axis preferentially, but would be nice to confirm. It is very possible it could be wrong way around, as this routine occurs rarely and only affects the optimality of the tree (in cases with elongated AABBs, which may be rare), so would be unlikely to be noticed.

I'll try and have another look at this.

Copy link
Member

@lawnjelly lawnjelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on balance this is worth a go.
It does look like it may have been unintentional, and if the other way had been better, I should have changed the comment at the time.

So I think we should go for this and watch for any regressions. 👍
Also needs changing in 3.x.

(For reference for mergers, this is unlikely to cause bugs, as it just affects the optimality of the tree when reshaping, and thus performance.)

Aside

As an aside it's nice to have another pair of eyes looking through this. It's quite specialized so doesn't get as much review as I would ideally like .. although @pouleyKetchoupp did spend quite a while working on it too. The area most in need of scrutiny if I remember right is the _logic_balance area as it depended on some of Randy Gaul's code.

I did in fact last week start writing a reference version as a kind of "unit test" so we could double check any physics bugs were physics side rather than BVH side. I haven't rolled this out but it didn't detect any bugs so far which is good.

I also think physics should have a reference brute force broad phase, both because of the possibility of bugs in BVH, but also because I have long suspected they are not taking proper account of tunneling. In BVH if you move AABB from A to B it will only consider collisions at A, and at B, rather than in between. Whereas for a move I suspect the physics should be passing a merged AABB of the current and previous AABB. BVH will not do this for the client, that is client responsibility.

@akien-mga akien-mga added this to the 4.2 milestone Sep 27, 2023
@akien-mga akien-mga added the cherrypick:3.x Considered for cherry-picking into a future 3.x release label Sep 27, 2023
@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 27, 2023

The area most in need of scrutiny if I remember right is the _logic_balance area as it depended on some of Randy Gaul's code.

_logic_balance seems to be fine, if we can ensure that the relevant nodes have been refitted.

The areas that I think are problematic are those that use node_remove_item() to remove and re-insert.

After a leaf node with multiple items uses node_remove_item() to remove an item, the aabb of the leaf node is outdated. Before calling refit_branch(), the results obtained by using other refit* methods may also be dirty, that is, these refit* calls seem to have little meaning.

node_remove_item(p_ref_id, tree_id, &abb);
// we must choose where to add to tree
ref.tnode_id = _logic_choose_item_add_node(_root_node_id[tree_id], abb);
_node_add_item(ref.tnode_id, p_ref_id, abb);
refit_upward_and_balance(ref.tnode_id, tree_id);

node_remove_item(ref_id, tree_id);
// we must choose where to add to tree
ref.tnode_id = _logic_choose_item_add_node(_root_node_id[tree_id], abb);
// add to the tree
bool needs_refit = _node_add_item(ref.tnode_id, ref_id, abb);
// only need to refit from the PARENT
if (needs_refit) {
// only need to refit from the parent
const TNode &add_node = _nodes[ref.tnode_id];
if (add_node.parent_id != BVHCommon::INVALID) {
// not sure we need to rebalance all the time, this can be done less often
refit_upward(add_node.parent_id);

@akien-mga akien-mga merged commit ec62b8a into godotengine:master Sep 27, 2023
@akien-mga
Copy link
Member

Thanks!

@lawnjelly
Copy link
Member

lawnjelly commented Sep 27, 2023

After a leaf node with multiple items uses node_remove_item() to remove an item, the aabb of the leaf node is outdated.

There is an optimization in node_remove_item, as recalculating the leaf AABB is expensive. If the removed node is totally within the leaf AABB (with some expansions), then it is not a "corner AABB", it does not push the extents, so there is no need to recalculate the leaf AABB.

To be a bit more clear on some of the thinking : in a few cases, the leaf bounds are purposefully not kept up to date. Even if a corner item is removed (and the leaf made dirty), as long as the bound is conservative (i.e. over-estimates) then nothing should go seriously wrong and the refit can be deferred.

This is done in a few places as keeping the bounds exactly up to date on each change would be prohibitively expensive. Keeping bounds up to date isn't a big deal with a traditional BVH with e.g. 1-4 items per leaf, but when you may be storing e.g. 1024 items in a leaf, the full update needs to be deferred as much as possible.

There may well be improvements available though, I spent a fair bit of time tweaking the refits they are quite difficult to get right.

@Rindbee Rindbee deleted the fix-axis-being-mixed-up branch September 27, 2023 13:53
@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

Before, I was concerned that it might be missing something. Now it feels like some leaf nodes may be calculated multiple times.

  • leaf nodes whose leaf marked dirty will be caught by refit_branch, and then recalculate these leaf nodes.
  • _node_add_item will recalculate the leaf node immediately;
  • split_leaf will create two new leaf node (their leaves will be marked dirty, for recalculate?) ;

It seems that clear should mark dirty as false; only in some special cases, node_remove_item should mark dirty as true.

--- a/core/math/bvh_structs.inc
+++ b/core/math/bvh_structs.inc
@@ -83,7 +83,7 @@ public:
 
        void clear() {
                num_items = 0;
-               set_dirty(true);
+               set_dirty(false);
        }

Aside

I'm confused by the results from test_perf_broadphase. While it feels like some expected stages are shortening, others are increasing. It feels like the modification shouldn't have much effect, but it has a significant effect.

Especially the Remove Time. A few days ago, testing 600x600 on my computer would take around 43s, now it takes around 30s.

@lawnjelly
Copy link
Member

lawnjelly commented Sep 28, 2023

Before, I was concerned that it might be missing something. Now it feels like some leaf nodes may be calculated multiple times.

Calculated in multiple places? Yes they can be. But ideally we don't want to calculate them multiple times though, if possible. But this may occur in some situations.

leaf nodes whose leaf marked dirty will be caught by refit_branch, and then recalculate these leaf nodes.

Yes.

_node_add_item will recalculate the leaf node immediately

_node_add_item returns a bool whether to refit, and like node_remove_item, it only orders a refit if the item added is a "corner item".

If I remember right, the general idea is this:

  • If a leaf AABB is expanded (by adding a border item), it should be calculated immediately.
  • Removing a non-border item does not affect the AABB, no calculation
  • Removing a border item shrinks the AABB, so it is marked as dirty, to be recalculated on the next pass

The AABB is thus a conservative bound.
In some cases expanding an AABB could be deferred, but most subsequent operations require the conservative AABB to be up to date.

split_leaf will create two new leaf node (their leaves will be marked dirty, for recalculate?)

Yes.

It seems that clear should mark dirty as false; only in some special cases, node_remove_item should mark dirty as true.

clear sets the number of items in a node to zero, therefore the currently stored AABB is out of date, thus the node is marked as dirty.

I'm confused by the results from test_perf_broadphase

I'm not familiar with this test, it seems one of pouleyKetchoups. I'm trying it out but get nothing on screen (maybe it deliberately doesn't render). Perhaps you can describe what this test does and what aspect of it you find interesting and why?

While it feels like some expected stages are shortening, others are increasing.

Do you mean "creating objects, adding objects, moving objects"? This is unclear.

It feels like the modification shouldn't have much effect, but it has a significant effect.

What modification are you referencing?

Especially the Remove Time. A few days ago, testing 600x600 on my computer would take around 43s, now it takes around 30s.

What has changed in these few days?

I'm really not clear on what you are saying here, if you can be more precise with language this could help (maybe this has been auto-translated?), and also it is important to not assume any knowledge.

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

It seems that clear should mark dirty as false; only in some special cases, node_remove_item should mark dirty as true.

clear sets the number of items in a node to zero, therefore the currently stored AABB is out of date, thus the node is marked as dirty.

Currently, the item will be added immediately after clear, which will immediately update the aabb of the leaf node. So it seems unnecessary for clear to mark dirty as true (and in most cases it is required that num_items > 0, the leaf node as the root node is an exception).

Do you mean "creating objects, adding objects, moving objects"? This is unclear.

Yes. The stages recorded in the output.

What modification are you referencing?

That's the result before and after applying this PR. I haven't figured out why yet.

What has changed in these few days?

I'm sure it should be before and after this PR is merged. I verified it by reverting the commit.

Some other interesting tests, they all have a greater impact on the time of "Removing objects":

Test 1.

Similarly, revert this PR, and then replace the above code with

if (!pending_shape_update_list.in_list()) {
	GodotPhysicsServer2D::godot_singleton->pending_shape_update_list.add(&pending_shape_update_list);
}

Test 2.

In the script, here, if removed in the reverse order, it takes less time.

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

I'm not familiar with this test, it seems one of pouleyKetchoups. I'm trying it out but get nothing on screen (maybe it deliberately doesn't render). Perhaps you can describe what this test does and what aspect of it you find interesting and why?

I'm not particularly familiar with it either, I just started learning about bvh last week and I'm not sure if these tests are appropriate.
It seems to add objects without overlapping, change the object's position, and then remove the object. It seems that the location and size are not random enough.

@lawnjelly
Copy link
Member

lawnjelly commented Sep 28, 2023

Currently, the item will be added immediately after clear, which will immediately update the aabb of the leaf node. So it seems unnecessary for clear to mark dirty as true (and in most cases it is required that num_items > 0, the leaf node as the root node is an exception).

I don't understand, is marking it as dirty causing a problem, or was this suggesting an optimization, or am I missing a call to set this as non-dirty afterwards?

That's the result before and after applying this PR. I haven't figured out why yet.

Especially in artificial benchmarks, changes in the tree structure could lead to more noticeable changes in performance. The test in question seems to create items in a regular grid as far as I can see. Now whether any change in timing is due to directly the BVH or due to more collisions being processed by the physics is a question which can be revealed by profiling. (This is part of the reason the physics should also have a switchable reference broad phase imo.)

if removed in the reverse order, it takes less time.

This is generally the case due to the scene tree processing, (as well as order being likely important for tree structures such as BVH). This is why "global timings" often end up being difficult to interpret. It is important not to conflate changes in timing caused by other parts of the engine, with parts that are trying to be examined. Using several timers for different parts of the code can help here, and of course profiling.

One thing that pouley did identify as a problem was that physics objects were often added at the origin then moved to their start location. This was causing all sorts of unnecessary collision processing to happen, because if everything started at the origin it all started in collision. Whether this was fully fixed in the physics I don't know.

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

I don't understand, is marking it as dirty causing a problem, or was this suggesting an optimization, or am I missing a call to set this as non-dirty afterwards?

There seems to be no big problem. I thought that setting dirty to false would slightly reduce the time, but the result was not much improved. The number of leaf nodes requiring calls to refit_upward is indeed reduced. I feel that refit_upward is not particularly expensive. Maybe the generated aabb is not random enough?

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

set_dirty(true); set_dirty(false);
0 1

@lawnjelly
Copy link
Member

lawnjelly commented Sep 28, 2023

I've spent a little time this morning testing the splitting and it seems mostly ok. It is possible that we can set the child leaves to non-dirty after the split, as their AABB seems to be up to date at the end of the splitting routine.

This could be causing a single extra un-needed refit at a later stage (it is done in in incremental_optimize, which is done once per tick or so), but it's probably not registered as a performance problem so far because:

  1. Splitting is rare, and the refit only occurs once I think.

I'll try and evaluate a bit better whether we can setting the child leaves as non-dirty after a split. I think it should be okay, but there's likely little gain to performance, and risk of regression, so worth treading with care. It could be that although the refit to the leaf itself isn't required, it ends up being needed for the ancestor nodes.

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

This seems to be a bug.

refit_upward(p_node_id);

p_node_id might be rp.node_id.

@lawnjelly
Copy link
Member

lawnjelly commented Sep 28, 2023

This seems to be a bug.

refit_upward(p_node_id);

p_node_id might be rp.node_id.

Ah yes well spotted, that doesn't look right.

Interesting that it was working ok with this bug .. suspect it had been getting by because:

  1. The splitting code didn't really need this refit (as above).
  2. This was also done when removing a border item, which should shrink the node AABB. But not shrinking it didn't break things, just made it less efficient

We should fix this, am just checking there aren't any other implications.

I think given that the dirty flag wasn't actually working, you are correct and we should probably fix this and set_dirty to false on creating new leaf nodes. Then the onus is on the calling code to:

  • Call refit upward immediately when expanding the borders when adding items to the leaf
  • Make dirty when reducing the borders of the leaf when removing items

In a way it would be nice to defer the case of adding items but I'm not sure it gains much because in almost every case the next operation requires refitting the tree to get everything up to date. 🤔

@Rindbee if you can make a PR I can review. I think for now we can just fix above bug (replacing with rp.node_id) and set_dirty(false) on clear, as this second thing must be safe given that it wasn't working(!) up to now. 😁

As far as I can see there are no major performance regressions to this (it should in theory be faster in the long run as the tree should be more optimal, but it's doing more refits than before). If necessary we could limit the refits to one per tick in the incremental optimize but I think it looks ok so far.

@Rindbee
Copy link
Contributor Author

Rindbee commented Sep 28, 2023

The number of splits is related to the number of items (_refs.used_size()).

It may be a little slower when adding items because the items will be moved (removed and reinserted), leaf nodes may be more likely to be marked dirty.

if (s.bpid == 0) {
s.bpid = space->get_broadphase()->create(this, i, shape_aabb, _static);
space->get_broadphase()->set_static(s.bpid, _static);
}
space->get_broadphase()->move(s.bpid, shape_aabb);

The results in #82436 (comment) are uncommon because aabb in actual cases may be more random.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug cherrypick:3.x Considered for cherry-picking into a future 3.x release topic:core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants