
[encoding] Fix path_reduced_scan buffer size #551

Merged: 1 commit into main from fix-scan-buffer-size, Apr 19, 2024
Conversation

armansito (Collaborator)

path_reduced_scan is the intermediate TagMonoid buffer used by the two-stage scan in the "use_large_path_scan" case. It needs to be the same size as the overall path_reduce output buffer, which gets rounded up to a multiple of the path_reduce workgroup size.

This hasn't been a problem until now because src/wgpu_engine.rs allocates Buffers that are generally larger than the entries returned in the BufferSizes structure, due to its quantized size class / pooling strategy. The bindings get their view size assigned based on the whole size of the buffer.

Skia uses a similar allocation strategy but assigns the view size to be the precise value from BufferSizes. This causes incorrect behavior, as WGSL clamps buffer accesses to the view size.

Correcting the buffer size fixes this issue.
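
A rough Rust sketch of the sizing rule described above (the constant and function names here are illustrative, not the actual identifiers in the encoding crate):

```rust
// Illustrative sketch only; PATH_REDUCE_WG and path_reduced_scan_size are
// assumed names, not the exact identifiers used by the encoding crate.
const PATH_REDUCE_WG: u32 = 256; // assumed path_reduce workgroup size

/// Size (in TagMonoid entries) of the intermediate scan buffer: the number
/// of path_reduce partials, rounded up to a whole multiple of the workgroup
/// size, so it matches the path_reduce output buffer.
fn path_reduced_scan_size(path_tag_wgs: u32) -> u32 {
    path_tag_wgs.next_multiple_of(PATH_REDUCE_WG)
}

fn main() {
    // e.g. 1_172 reduce workgroups round up to 1_280 entries (5 * 256),
    // which is what the scan may actually write.
    assert_eq!(path_reduced_scan_size(1_172), 1_280);
}
```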

@raphlinus (Contributor) left a comment

Yes, this is definitely a problem in the original - pathtag_scan1 writes in full workgroup increments. I'm assuming the observed problem only manifests with more than 256k pathtags? I did look at the code in the small case and it seems fine, but it's possible I've missed something.

In any case, good catch.
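
For illustration, a rough Rust sketch of the arithmetic behind the 256k figure (the workgroup size of 256 is assumed, and the 1024-tags-per-reduce-workgroup value is inferred from that figure rather than taken from the shaders):

```rust
// Illustrative arithmetic only; the constants are assumptions inferred from
// the "more than 256k pathtags" figure above, not read from the shader source.
const PATH_REDUCE_WG: u32 = 256; // assumed scan/reduce workgroup size
const TAGS_PER_REDUCE_WG: u32 = 1024; // tags covered by one reduce workgroup (inferred)

/// A single scan workgroup can combine at most PATH_REDUCE_WG partial sums,
/// so once the number of reduce workgroups exceeds that, the two-stage
/// ("use_large_path_scan") path is needed.
fn needs_large_path_scan(n_path_tags: u32) -> bool {
    n_path_tags.div_ceil(TAGS_PER_REDUCE_WG) > PATH_REDUCE_WG
}

fn main() {
    assert!(!needs_large_path_scan(256 * 1024));    // 262_144 tags: small case
    assert!(needs_large_path_scan(256 * 1024 + 1)); // past 256k: large scan
}
```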

@armansito added this pull request to the merge queue, Apr 19, 2024
@armansito (Collaborator, Author)

> Yes, this is definitely a problem in the original - pathtag_scan1 writes in full workgroup increments. I'm assuming the observed problem only manifests with more than 256k pathtags? I did look at the code in the small case and it seems fine, but it's possible I've missed something.
>
> In any case, good catch.

Yes, I've only observed this in the large scan case.

Merged via the queue into main with commit b1dc07e Apr 19, 2024
15 checks passed
@armansito deleted the fix-scan-buffer-size branch, April 19, 2024 18:17
@waywardmonkeys added this to the Vello 0.2 release milestone, May 3, 2024