Improve frame storage #74

javierhonduco · 2023-02-27T21:48:04Z

This commit changes a few things. First, it gets rid of the random IDs used to identify frames: the birthday paradox, and production, show that we'll quickly find duplicates using 32 bits. This now uses a global array instead. In BPF, arrays are zero-initialised, so we don't need to do that explicitly. As we are using shared memory across multiple threads, we read and increment the IDs atomically.
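To illustrate the atomic read + increment described above, here is a minimal userspace sketch. It is not the code from this PR: `MAX_FRAMES`, `FRAME_ID_NONE`, `next_frame_id` and `alloc_frame_id` are hypothetical names, and the capacity is kept tiny for illustration. It relies on the `__sync_fetch_and_add` builtin that GCC and Clang provide.

```c
#include <stdint.h>

/* Hypothetical capacity of the frame table, tiny for illustration. */
#define MAX_FRAMES 8u
/* Hypothetical sentinel returned when the table is full. */
#define FRAME_ID_NONE UINT32_MAX

/* File-scope variables start at zero, much like BPF arrays,
 * so no explicit initialisation is needed. */
static uint32_t next_frame_id;

/* Hand out sequential IDs. The fetch-and-add returns the previous value
 * and increments in a single atomic step, so two threads can never
 * receive the same ID. */
static uint32_t alloc_frame_id(void) {
    uint32_t id = __sync_fetch_and_add(&next_frame_id, 1);
    /* The counter keeps growing after the table fills up; a real
     * implementation would also need to consider 32-bit wrap-around. */
    return id < MAX_FRAMES ? id : FRAME_ID_NONE;
}
```

Sequential IDs are unique by construction until the table is full, which is what removes the collisions that random 32-bit IDs produce.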

In addition to these changes, which should reduce the chances of collisions and hence of lost frames, this also bumps the number of frames we can store and reduces the number of processes we can profile at once, as the latter was a very high number. We can tune these later on.

Notes

Another approach that I've checked for ID generation is to use a per-CPU counter and then offset it by the CPU number. This works, but care has to be taken to make it work on systems with an "unbounded" number of CPUs. Not using an atomic get+increment is cheaper, but we would have to call the CPU ID helper, so perf-wise it's probably in the same ballpark.
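As a rough sketch of the per-CPU alternative, one possible layout puts the CPU number in the top bits of the ID and a per-CPU counter in the rest. Everything here is an assumption made for illustration: the names `MAX_CPUS`, `CPU_BITS`, `per_cpu_counter` and `next_id_for_cpu`, the 8/24 bit split, and passing the CPU in as a parameter. In BPF the CPU number would come from the `bpf_get_smp_processor_id()` helper.

```c
#include <stdint.h>

/* Hypothetical bit split: 8 bits for the CPU, 24 for the counter. */
#define CPU_BITS 8u
#define COUNTER_BITS (32u - CPU_BITS)
#define MAX_CPUS (1u << CPU_BITS)

/* One counter per CPU. Each CPU only touches its own slot,
 * so no atomic operation is needed. */
static uint32_t per_cpu_counter[MAX_CPUS];

/* Writes the next ID for `cpu` to `out` and returns 0, or returns -1 if
 * the CPU number or the counter does not fit in its share of the bits. */
static int next_id_for_cpu(uint32_t cpu, uint32_t *out) {
    if (cpu >= MAX_CPUS) {
        return -1; /* the "unbounded" number of CPUs problem */
    }
    uint32_t local = per_cpu_counter[cpu];
    if (local >= (1u << COUNTER_BITS)) {
        return -1; /* this CPU has used up its ID range */
    }
    per_cpu_counter[cpu] = local + 1;
    *out = (cpu << COUNTER_BITS) | local;
    return 0;
}
```

The fixed bit split is where the care mentioned above comes in: every bit reserved for the CPU number shrinks the range of IDs each CPU can hand out, and a machine with more CPUs than `MAX_CPUS` cannot be served at all.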

Another avenue we could explore is to have a generic symbol table to store both method names and paths. This would reduce the number of stored strings, as we wouldn't need to store every combination of path names x directories, but it would waste more space, as these two string buffers have different sizes.

This commit doesn't fully fix the issue of collisions. A better fix will come later by processing the samples in a streaming fashion.

Signed-off-by: Francisco Javier Honduvilla Coto <javierhonduco@gmail.com>


javierhonduco · 2023-02-27T21:49:00Z

@manuelfelipe This should help with #70, let me know if that's not the case 😄

javierhonduco merged commit dd7e327 into main Feb 27, 2023

javierhonduco deleted the frame-table-improvements branch February 27, 2023 21:56

manuelfelipe mentioned this pull request Feb 28, 2023

Do not use random numbers as IDs for the frames #70

Closed

javierhonduco mentioned this pull request Mar 12, 2023

Mismatched frame count issues #72

Closed
