-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Display list culling prototype based on buffer offsets #38189
Conversation
Proceeding further with this prototype would like require landing the code that merges the bounds/rtree generation into the builder process so that we can remember the buffer offsets for the various primitives that triggered entries in the rtree. I looked briefly at a workaround, but the bounds/rtree calculators that are currently used don't have access to the offsets of the primitives that they are measuring and potentially recording and we won't really know at construction/build time which primitives will have an associated rect in the tree. The main issues seem to be:
Because of the indeterminate nature of the above optional bounds cases, the builder can't simply remember all of the offsets proactively since not every offset is associated with an entry in the tree. |
Looking for feedback from @ColdPaleLight and @endless7 |
It occurs to me that a variation of this prototype could be based on the "index" of the rendering primitives - i.e. counting records rather than offsets. That variation would require the rendering ops (or any op that has a bounding box in the rtree) to increment the cur_index rather than having that done in the enclosing loop. |
I have a prototype of the "index" version that I've also plumbed to remember the indices of the ops that have a rectangle in the rtree so that it can actually perform an RTree search and identify the ops that need to be culled. I need to modify the rendering code to pass in the cull_rect at render time to see if it is working, though. I'll submit that as a new PR so that the 2 implementations can be compared. |
The implementation in SkPicture is very similar to this, and @endless7 has tried this solution before. The conclusion at that time is that this solution requires a one-to-one correspondence between the rects in RTree and the Ops in DisplayList (they are already in one-to-one correspondence in Skia), but now there is not a one-to-one correspondence in DisplayList. |
This whole PR is to demonstrate that that conclusion is not accurate. Neither of these prototypes necessarily require that condition. saveLayer operations can have a rect or not, they will still be executed appropriately. Rendering operations require an op, though, (unless they are a NOP or clipped out) and that is the one glitch left. Right now if a rendering operation has an attribute which causes it to flood the clip area, and no clip has been established and no bounds were handed to the Builder, then it will only set a flag. For RTree generation we should simply require a bounds to be handed to the Builder and then there will be a 1:1 correspondence between rendering ops and rects and the requirements of these 2 prototypes will be satisfied. |
Waiting for #34365 to land before I do much more work on this concept. It will be much easier to track ops vs rects once that PR lands. One issue that I see is that sometimes adding a DL will add a bunch of rects to the RTree and they might all show up in the query that is trying to do render clipping. We might want to have a way to query the RTree for embedder overlay detection that includes those sub-rects and another way that is doing render culling that only queries on the bounds. This could all be done with metadata and filtering in the methods that do the query without having to compute multiple RTrees, but given that dispatching a sub-DL from a parent-DL can pass along the culling, we only need its rect to appear once in the query list for render culling. |
The "op index" variant of this PR has been submitted as #38304 |
Closing in favor of #38429 |
This is a variant of #38304 that uses offsets of the ops within the buffer rather than indices to control which ops re rendered or culled. The description that follows is mostly the same as the other PR.
This PR creates the underpinnings for a mechanism that can be used to cull DisplayLists based on the offsets of the rendering operations that are desired. Such a list of offsets could be obtained from the RTree that the DisplayList maintains as long as it is annotated with the offset into the buffer of the DLOp that generated the rectangle in the tree.
Note that only the offsets of the rendering ops are required.
The following ops will cull themselves if they are not needed:
The following ops will not cull themselves and will always be executed. The tracking to prevent them from being executed would require a lot of overhead that is unlikely to save much time:
The code has been tested by setting the "next_rendering_offset" to the current offset or to 0 and all of the tests seem to pass which indicates that either of these techniques can be used in the un-culled Dispatch case. To enable culling, a list of offsets would have to be provided and the line of code in DisplayList::Dispatch which talks about how to update next_rendering_offset would need to be updated to fill in the offset of the next rendering primitive in the list whenever we pass and execute the previous primitive.
The performance of the overhead of this mechanism has not (yet) been measured.