Skip to content

Commit

Permalink
Lift line vertex/strip count limitations (#5207)
Browse files Browse the repository at this point in the history
### What

* follow-up of #5192 
* Fixes #3076
* Fixes #4844
* _technically_ there's still limits (see #5192 ) but they are machine
dependent making this relatively hard to query. Overall I'd argue we
removed the previous limits so there's no need for this in the original
sense.

<img width="1252" alt="image"
src="https://github.com/rerun-io/rerun/assets/1220815/512be197-9817-4f5a-8054-e3661346d361">
(note that we haven't spend a lot of time optimizing the collection of
lines, large amounts of lines don't perform all that well so far)


Applies the same principles as on the previous PR that fixes the point
cloud limits.
The key difference here is that we use lines in *a lot* of places,
making this a bigger refactor than originally assumed.

---

~The need to know both strip & vertex count for lines ahead of time is a
bit problematic and isn't always as easy as it was with points. We err
on the side of generating more draw data bundles, but to limit this (a
single drawdata for a single line is extremely wasteful as we have to
allocate a bunch of textures, buffers, bind groups, etc. etc.) I had to
introduce `LineDrawableBuilderAllocator` which is a very simplistic
Vec-like allocator (minus the increase in size) for
`LineDrawableBuilder` (previously called `LineStripSeriesBuilder`).
I'm not super happy with this construct overall, but it's the best I
could come up with in the short-term and things seem to be fairly robust
and at least not overly complicated.~

~In the future it would be nice to reconcile
`LineDrawableBuilderAllocator` and `LineDrawableBuilder` into a single
construct, likely still with the limitations that the size of a batch
(think named unit with a transform) needs to be known ahead of time,
which is practically always the case!~

---

Second iteration: There's now `DataTextureSource` (ideas for better
names?) which is essentially a thing where you can throw data in and get
a data texture out! It handles all the copies and dynamic sizings for
you. This makes everything awesome because now we can handle `reserve`
call just as an optimization without requiring them and without being on
a bad path if you don't! <3

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using newly built examples:
[app.rerun.io](https://app.rerun.io/pr/5207/index.html)
* Using examples from latest `main` build:
[app.rerun.io](https://app.rerun.io/pr/5207/index.html?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[app.rerun.io](https://app.rerun.io/pr/5207/index.html?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] Test misc examples on WebGL
* [x] Test misc examples on WebGPU

- [PR Build Summary](https://build.rerun.io/pr/5207)
- [Docs
preview](https://rerun.io/preview/012ab21d8acbf86f8d45bfdba3737f8ebe989784/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/012ab21d8acbf86f8d45bfdba3737f8ebe989784/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)
  • Loading branch information
Wumpf authored Feb 19, 2024
1 parent cb32693 commit bf9a821
Show file tree
Hide file tree
Showing 32 changed files with 901 additions and 645 deletions.
18 changes: 11 additions & 7 deletions crates/re_renderer/shader/lines.wgsl
Original file line number Diff line number Diff line change
Expand Up @@ -36,9 +36,6 @@ struct BatchUniformBuffer {
@group(2) @binding(0)
var<uniform> batch: BatchUniformBuffer;

const POSITION_TEXTURE_SIZE: u32 = 512u;
const LINE_STRIP_TEXTURE_SIZE: u32 = 256u;

// Flags
// See lines.rs#LineStripFlags
const FLAG_CAP_END_TRIANGLE: u32 = 1u;
Expand Down Expand Up @@ -89,8 +86,13 @@ struct LineStripData {

// Read and unpack line strip data at a given location
fn read_strip_data(idx: u32) -> LineStripData {
let coord = vec2u(idx % LINE_STRIP_TEXTURE_SIZE, idx / LINE_STRIP_TEXTURE_SIZE);
var raw_data = textureLoad(position_data_texture, coord, 0).xy;
let position_data_texture_size = textureDimensions(position_data_texture);
let raw_data = textureLoad(position_data_texture,
vec2u(idx % position_data_texture_size.x, idx / position_data_texture_size.x), 0);

let picking_instance_id_texture_size = textureDimensions(picking_instance_id_texture);
let picking_instance_id = textureLoad(picking_instance_id_texture,
vec2u(idx % picking_instance_id_texture_size.x, idx / picking_instance_id_texture_size.x), 0).xy;

var data: LineStripData;
data.color = linear_from_srgba(unpack4x8unorm_workaround(raw_data.x));
Expand All @@ -99,7 +101,7 @@ fn read_strip_data(idx: u32) -> LineStripData {
data.unresolved_radius = unpack2x16float(raw_data.y).y;
data.flags = ((raw_data.y >> 8u) & 0xFFu);
data.stippling = f32((raw_data.y >> 16u) & 0xFFu) * (1.0 / 255.0);
data.picking_instance_id = textureLoad(picking_instance_id_texture, coord, 0).rg;
data.picking_instance_id = picking_instance_id;
return data;
}

Expand All @@ -110,7 +112,9 @@ struct PositionData {

// Read and unpack position data at a given location
fn read_position_data(idx: u32) -> PositionData {
var raw_data = textureLoad(line_strip_texture, vec2u(idx % POSITION_TEXTURE_SIZE, idx / POSITION_TEXTURE_SIZE), 0);
let texture_size = textureDimensions(line_strip_texture);
let coord = vec2u(idx % texture_size.x, idx / texture_size.x);
var raw_data = textureLoad(line_strip_texture, coord, 0);

var data: PositionData;
let pos_4d = batch.world_from_obj * vec4f(raw_data.xyz, 1.0);
Expand Down
70 changes: 41 additions & 29 deletions crates/re_renderer/src/allocator/cpu_write_gpu_read_belt.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,14 @@ pub enum CpuWriteGpuReadError {
#[error("Attempting to allocate an empty buffer.")]
ZeroSizeBufferAllocation,

#[error("Buffer is full, can't append more data!
Buffer contains {buffer_element_size} elements and has a capacity for {buffer_element_capacity} elements.
Tried to add {num_elements_attempted_to_add} elements.")]
#[error(
"Buffer is full, can't append more data! Buffer has a capacity for {buffer_capacity_elements} elements.
Tried to add {num_elements_attempted_to_add} elements, but only added {num_elements_actually_added}."
)]
BufferFull {
buffer_element_capacity: usize,
buffer_element_size: usize,
buffer_capacity_elements: usize,
num_elements_attempted_to_add: usize,
num_elements_actually_added: usize,
},

#[error("Target buffer has a size of {target_buffer_size}, can't write {copy_size} bytes with an offset of {destination_offset}!")]
Expand All @@ -26,9 +27,9 @@ pub enum CpuWriteGpuReadError {
destination_offset: u64,
},

#[error("Target texture doesn't fit the size of the written data to this buffer! Texture copy size: {copy_size:?} bytes, written data size: {written_data_size} bytes")]
#[error("Target texture doesn't fit the size of the written data to this buffer! Texture target buffer should be at most {max_copy_size} bytes, but the to be written data was {written_data_size} bytes.")]
TargetTextureBufferSizeMismatch {
copy_size: wgpu::Extent3d,
max_copy_size: u64,
written_data_size: usize,
},
}
Expand Down Expand Up @@ -89,14 +90,16 @@ where
#[inline]
pub fn extend_from_slice(&mut self, elements: &[T]) -> Result<(), CpuWriteGpuReadError> {
re_tracing::profile_function!();
let (result, elements) = if elements.len() > self.remaining_capacity() {

let remaining_capacity = self.remaining_capacity();
let (result, elements) = if elements.len() > remaining_capacity {
(
Err(CpuWriteGpuReadError::BufferFull {
buffer_element_capacity: self.capacity(),
buffer_element_size: self.num_written(),
buffer_capacity_elements: self.capacity(),
num_elements_attempted_to_add: elements.len(),
num_elements_actually_added: remaining_capacity,
}),
&elements[..self.remaining_capacity()],
&elements[..remaining_capacity],
)
} else {
(Ok(()), elements)
Expand All @@ -111,12 +114,12 @@ where

/// Pushes several elements into the buffer.
///
/// If the buffer is not big enough, only the first `self.remaining_capacity()` elements are pushed before returning an error.
/// If the buffer is not big enough, only the first [`CpuWriteGpuReadBuffer::remaining_capacity`] elements are pushed before returning an error.
/// Otherwise, returns the number of elements pushed for convenience.
#[inline]
pub fn extend(
&mut self,
elements: impl Iterator<Item = T>,
mut elements: impl Iterator<Item = T>,
) -> Result<usize, CpuWriteGpuReadError> {
re_tracing::profile_function!();

Expand All @@ -129,12 +132,16 @@ where
} else {
let num_written_before = self.num_written();

for element in elements {
while let Some(element) = elements.next() {
if self.unwritten_element_range.start >= self.unwritten_element_range.end {
let num_elements_actually_added = self.num_written() - num_written_before;

return Err(CpuWriteGpuReadError::BufferFull {
buffer_element_capacity: self.capacity(),
buffer_element_size: self.num_written(),
num_elements_attempted_to_add: 1,
buffer_capacity_elements: self.capacity(),
num_elements_attempted_to_add: num_elements_actually_added
+ elements.count()
+ 1,
num_elements_actually_added,
});
}

Expand All @@ -150,16 +157,22 @@ where
/// Fills the buffer with n instances of an element.
///
/// If the buffer is not big enough, only the first `self.remaining_capacity()` elements are pushed before returning an error.
pub fn fill_n(&mut self, element: T, num_elements: usize) -> Result<(), CpuWriteGpuReadError> {
pub fn add_n(&mut self, element: T, num_elements: usize) -> Result<(), CpuWriteGpuReadError> {
if num_elements == 0 {
return Ok(());
}

re_tracing::profile_function!();
let (result, num_elements) = if num_elements > self.remaining_capacity() {

let remaining_capacity = self.remaining_capacity();
let (result, num_elements) = if num_elements > remaining_capacity {
(
Err(CpuWriteGpuReadError::BufferFull {
buffer_element_capacity: self.capacity(),
buffer_element_size: self.num_written(),
buffer_capacity_elements: self.capacity(),
num_elements_attempted_to_add: num_elements,
num_elements_actually_added: remaining_capacity,
}),
self.remaining_capacity(),
remaining_capacity,
)
} else {
(Ok(()), num_elements)
Expand All @@ -181,14 +194,14 @@ where

/// Pushes a single element into the buffer and advances the write pointer.
///
/// Panics if the data no longer fits into the buffer.
/// Returns an error if the data no longer fits into the buffer.
#[inline]
pub fn push(&mut self, element: T) -> Result<(), CpuWriteGpuReadError> {
if self.remaining_capacity() == 0 {
return Err(CpuWriteGpuReadError::BufferFull {
buffer_element_capacity: self.capacity(),
buffer_element_size: self.num_written(),
buffer_capacity_elements: self.capacity(),
num_elements_attempted_to_add: 1,
num_elements_actually_added: 0,
});
}

Expand Down Expand Up @@ -255,12 +268,11 @@ where
let buffer_info = Texture2DBufferInfo::new(destination.texture.format(), copy_size);

// Validate that we stay within the written part of the slice (wgpu can't fully know our intention here, so we have to check).
// We go one step further and require the size to be exactly equal - it's too unlikely that you wrote more than is needed!
// (and if you did you probably have regrets anyways!)
if buffer_info.buffer_size_padded as usize != self.num_written() * std::mem::size_of::<T>()
// This is a bit of a leaky check since we haven't looked at copy_size which may limit the amount of memory we need.
if (buffer_info.buffer_size_padded as usize) < self.num_written() * std::mem::size_of::<T>()
{
return Err(CpuWriteGpuReadError::TargetTextureBufferSizeMismatch {
copy_size,
max_copy_size: buffer_info.buffer_size_padded,
written_data_size: self.num_written() * std::mem::size_of::<T>(),
});
}
Expand Down
Loading

0 comments on commit bf9a821

Please sign in to comment.