-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lift line vertex/strip count limitations #5207
Conversation
2fcee4b
to
a785b2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calculating the number of lines needed beforehand is quite annoying and error prone.
I think it makes sense for the big stuff (when a user has logged line segments), but for smaller stuff (camera ray lines, bounding box edges, …), I wonder if we could instead use a dynamic mode for the line allocator, i.e. that collects in RAM and push to VRAM once finished.
crates/re_space_view_spatial/src/visualizers/transform3d_arrows.rs
Outdated
Show resolved
Hide resolved
crates/re_space_view_spatial/src/visualizers/transform3d_arrows.rs
Outdated
Show resolved
Hide resolved
crates/re_space_view_spatial/src/visualizers/transform3d_arrows.rs
Outdated
Show resolved
Hide resolved
yes, probably. But do we really want to fork the linebuilder right now :/ |
Hmmm actually.. maybe we can make the regular line builder growable by having several cpu-write-gpu-read buffers in there 🤔 |
/// row size in bytes is a multiple of `wgpu::COPY_BYTES_PER_ROW_ALIGNMENT`. | ||
/// This makes it a lot easier to copy data from a continuous buffer to the texture. | ||
/// If we wouldn't do that, we'd need to do a copy for each row in some cases. | ||
pub fn data_texture_size( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these three functions were moved without change other than a todo note on their now short-lived fate
Alright! This is now worlds better than before and will make putting everything else in points & lines onto a cpu-write-gpu-read buffer a breeze!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love it!
But the main invariant is not always guaranteed to be kept, so this is buggy
I wish there was an easy way to write unit tests for this new allocator without shimming all wgpu calls 🤔 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I added checkboxes to explicitly test on both WebGL and WebGPU
…ht perf improvements for large point clouds (#5229) ### What * Part of #1398 * Follow-up to #5207 Point cloud builder uses now `DataTextureSource` for all its data. `DataTextureSource` is now better at dealing with the max texture size: data writes are clamped ahead of time, meaning we no longer reserve memory for data that we can't write into the texture upon `DataTexturesSource::finish`. This bubbled up now since `PointCloudBuilder` so far used a vec for positions and checked that against the maximum. The changes here leave the check in place, but make `DataTextureSource` reserving & writing clamp to the maximum and deal with this robustly even if high level data structures (like `PointCloudBuilder`) fail to handle this. Artificially limiting the data texture size to 256 * 256 looks like this: <img width="1751" alt="image" src="https://github.com/rerun-io/rerun/assets/1220815/ff65944c-b499-4772-8993-9619daf8d6b0"> I tested performance before/after on these changes on two opf scenes on my mac to check for changes in perf and found the switch to using `DataTextureSource` for position/radius improved performance slightly: * olympic (1.5mio points): ~7.0ms -> ~6.8ms * rainwater (4.6mio points): ~17.1ms-> ~16.5ms (numbers via performance metrics, release build) ### Checklist * [x] I have read and agree to [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md) * [x] I've included a screenshot or gif (if applicable) * [x] I have tested the web demo (if applicable): * Using newly built examples: [app.rerun.io](https://app.rerun.io/pr/5229/index.html) * Using examples from latest `main` build: [app.rerun.io](https://app.rerun.io/pr/5229/index.html?manifest_url=https://app.rerun.io/version/main/examples_manifest.json) * Using full set of examples from `nightly` build: [app.rerun.io](https://app.rerun.io/pr/5229/index.html?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json) * [x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG * [x] If applicable, add a new check to the [release checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)! - [PR Build Summary](https://build.rerun.io/pr/5229) - [Docs preview](https://rerun.io/preview/963956c93faa135571b48dc764f26ffea87e186b/docs) <!--DOCS-PREVIEW--> - [Examples preview](https://rerun.io/preview/963956c93faa135571b48dc764f26ffea87e186b/examples) <!--EXAMPLES-PREVIEW--> - [Recent benchmark results](https://build.rerun.io/graphs/crates.html) - [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)
What
Applies the same principles as on the previous PR that fixes the point cloud limits.
The key difference here is that we use lines in a lot of places, making this a bigger refactor than originally assumed.
The need to know both strip & vertex count for lines ahead of time is a bit problematic and isn't always as easy as it was with points. We err on the side of generating more draw data bundles, but to limit this (a single drawdata for a single line is extremely wasteful as we have to allocate a bunch of textures, buffers, bind groups, etc. etc.) I had to introduceLineDrawableBuilderAllocator
which is a very simplistic Vec-like allocator (minus the increase in size) forLineDrawableBuilder
(previously calledLineStripSeriesBuilder
).I'm not super happy with this construct overall, but it's the best I could come up with in the short-term and things seem to be fairly robust and at least not overly complicated.
In the future it would be nice to reconcileLineDrawableBuilderAllocator
andLineDrawableBuilder
into a single construct, likely still with the limitations that the size of a batch (think named unit with a transform) needs to be known ahead of time, which is practically always the case!Second iteration: There's now
DataTextureSource
(ideas for better names?) which is essentially a thing where you can throw data in and get a data texture out! It handles all the copies and dynamic sizings for you. This makes everything awesome because now we can handlereserve
call just as an optimization without requiring them and without being on a bad path if you don't! <3Checklist
main
build: app.rerun.ionightly
build: app.rerun.io