Display list culling prototype based on buffer offsets #38189

flar · 2022-12-11T03:28:03Z

This is a variant of #38304 that uses offsets of the ops within the buffer rather than indices to control which ops re rendered or culled. The description that follows is mostly the same as the other PR.

This PR creates the underpinnings for a mechanism that can be used to cull DisplayLists based on the offsets of the rendering operations that are desired. Such a list of offsets could be obtained from the RTree that the DisplayList maintains as long as it is annotated with the offset into the buffer of the DLOp that generated the rectangle in the tree.

Note that only the offsets of the rendering ops are required.

The following ops will cull themselves if they are not needed:

all rendering ops (Draw...Op)
all saveLayers with their matching restores
all save operations with their matching restores
all transform ops
all clip ops

The following ops will not cull themselves and will always be executed. The tracking to prevent them from being executed would require a lot of overhead that is unlikely to save much time:

all set/clear attribute calls, most of which assign a simple value into a field in the dispatcher, some of which may cause a shared object to be created, though.

The code has been tested by setting the "next_rendering_offset" to the current offset or to 0 and all of the tests seem to pass which indicates that either of these techniques can be used in the un-culled Dispatch case. To enable culling, a list of offsets would have to be provided and the line of code in DisplayList::Dispatch which talks about how to update next_rendering_offset would need to be updated to fill in the offset of the next rendering primitive in the list whenever we pass and execute the previous primitive.

The performance of the overhead of this mechanism has not (yet) been measured.

flar · 2022-12-12T20:41:31Z

Proceeding further with this prototype would like require landing the code that merges the bounds/rtree generation into the builder process so that we can remember the buffer offsets for the various primitives that triggered entries in the rtree.

I looked briefly at a workaround, but the bounds/rtree calculators that are currently used don't have access to the offsets of the primitives that they are measuring and potentially recording and we won't really know at construction/build time which primitives will have an associated rect in the tree. The main issues seem to be:

Primitives which flood the clip due to modifying transparent black often simply record a flag rather than add a bounds
saveLayer usually doesn't add a bounds, and this prototype doesn't need them to do so, but in some cases of having attributes that flood the enclosing clip, a saveLayer might add the clip bounds to the accumulator.

Because of the indeterminate nature of the above optional bounds cases, the builder can't simply remember all of the offsets proactively since not every offset is associated with an entry in the tree.

flar · 2022-12-12T20:42:31Z

Looking for feedback from @ColdPaleLight and @endless7

flar · 2022-12-12T21:54:59Z

It occurs to me that a variation of this prototype could be based on the "index" of the rendering primitives - i.e. counting records rather than offsets. That variation would require the rendering ops (or any op that has a bounding box in the rtree) to increment the cur_index rather than having that done in the enclosing loop.

flar · 2022-12-13T09:14:35Z

I have a prototype of the "index" version that I've also plumbed to remember the indices of the ops that have a rectangle in the rtree so that it can actually perform an RTree search and identify the ops that need to be culled. I need to modify the rendering code to pass in the cull_rect at render time to see if it is working, though. I'll submit that as a new PR so that the 2 implementations can be compared.

ColdPaleLight · 2022-12-13T15:20:09Z

I have a prototype of the "index" version that I've also plumbed to remember the indices of the ops that have a rectangle in the rtree so that it can actually perform an RTree search and identify the ops that need to be culled. I need to modify the rendering code to pass in the cull_rect at render time to see if it is working, though. I'll submit that as a new PR so that the 2 implementations can be compared.

The implementation in SkPicture is very similar to this, and @endless7 has tried this solution before. The conclusion at that time is that this solution requires a one-to-one correspondence between the rects in RTree and the Ops in DisplayList (they are already in one-to-one correspondence in Skia), but now there is not a one-to-one correspondence in DisplayList.

c.f.
https://github.com/google/skia/blob/b37516f2cf040e02783810a6c587555cbff559c1/src/core/SkRecordDraw.cpp#L26-L47

flar · 2022-12-13T17:22:04Z

The implementation in SkPicture is very similar to this, and @endless7 has tried this solution before. The conclusion at that time is that this solution requires a one-to-one correspondence between the rects in RTree and the Ops in DisplayList (they are already in one-to-one correspondence in Skia), but now there is not a one-to-one correspondence in DisplayList.

c.f. https://github.com/google/skia/blob/b37516f2cf040e02783810a6c587555cbff559c1/src/core/SkRecordDraw.cpp#L26-L47

This whole PR is to demonstrate that that conclusion is not accurate. Neither of these prototypes necessarily require that condition.

saveLayer operations can have a rect or not, they will still be executed appropriately.

Rendering operations require an op, though, (unless they are a NOP or clipped out) and that is the one glitch left. Right now if a rendering operation has an attribute which causes it to flood the clip area, and no clip has been established and no bounds were handed to the Builder, then it will only set a flag. For RTree generation we should simply require a bounds to be handed to the Builder and then there will be a 1:1 correspondence between rendering ops and rects and the requirements of these 2 prototypes will be satisfied.

flar · 2022-12-14T22:06:10Z

Waiting for #34365 to land before I do much more work on this concept. It will be much easier to track ops vs rects once that PR lands.

One issue that I see is that sometimes adding a DL will add a bunch of rects to the RTree and they might all show up in the query that is trying to do render clipping. We might want to have a way to query the RTree for embedder overlay detection that includes those sub-rects and another way that is doing render culling that only queries on the bounds. This could all be done with metadata and filtering in the methods that do the query without having to compute multiple RTrees, but given that dispatching a sub-DL from a parent-DL can pass along the culling, we only need its rect to appear once in the query list for render culling.

flar · 2022-12-15T01:07:59Z

The "op index" variant of this PR has been submitted as #38304

flar · 2022-12-22T02:22:38Z

Closing in favor of #38429

flar added 2 commits December 10, 2022 19:12

Conceptual prototype for RTree culling in DisplayList

d78fce8

remove potentially ambiguous null ptr

146fba5

flar mentioned this pull request Dec 11, 2022

DisplayList has no proactive culling as was available in SkPicture flutter/flutter#92740

Closed

flar mentioned this pull request Dec 15, 2022

Display list culling prototype based on op indices #38304

Closed

flar changed the title ~~Display list culling prototype~~ Display list culling prototype based on buffer offsets Dec 15, 2022

flar marked this pull request as draft December 15, 2022 01:10

flar mentioned this pull request Dec 20, 2022

Display list R-Tree culling #38429

Merged

flar closed this Dec 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Display list culling prototype based on buffer offsets #38189

Display list culling prototype based on buffer offsets #38189

flar commented Dec 11, 2022 •

edited

Loading

flar commented Dec 12, 2022

flar commented Dec 12, 2022

flar commented Dec 12, 2022

flar commented Dec 13, 2022

ColdPaleLight commented Dec 13, 2022

flar commented Dec 13, 2022 •

edited

Loading

flar commented Dec 14, 2022

flar commented Dec 15, 2022

flar commented Dec 22, 2022

Display list culling prototype based on buffer offsets #38189

Display list culling prototype based on buffer offsets #38189

Conversation

flar commented Dec 11, 2022 • edited Loading

flar commented Dec 12, 2022

flar commented Dec 12, 2022

flar commented Dec 12, 2022

flar commented Dec 13, 2022

ColdPaleLight commented Dec 13, 2022

flar commented Dec 13, 2022 • edited Loading

flar commented Dec 14, 2022

flar commented Dec 15, 2022

flar commented Dec 22, 2022

flar commented Dec 11, 2022 •

edited

Loading

flar commented Dec 13, 2022 •

edited

Loading