Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve frame storage #74

Merged
merged 1 commit into from
Feb 27, 2023
Merged

Improve frame storage #74

merged 1 commit into from
Feb 27, 2023

Conversation

javierhonduco
Copy link
Owner

This commit changes a few things, first, it gets rid of the random IDs used to identify frames, as the birthday paradox, and production, shows that we'll quickly find duplicates using 32 bits. This now uses a global array. In BPF arrays are zero initialised, so we don't need to do that explicitly. As we are using shared memory across multiple threads, we read + increment the IDs atomically.

In addition to these changes that should reduce the chances of collisions and hence of lost frames, also bump the number of frames we can store and reduce the number of processes we can profile at once, as it's a very high number. We can tune these later on.

Notes

Other approaches that I've checked for ID generation are to use a per CPU counter and then offset the CPU number. This works but care has to be taken to make this work in systems with an "unbounded" number of CPUs. Not using an atomic get+increment is cheaper, but we would have to call the CPU id helper, so perf-wise it's probably in the same ballpark.

Another avenue we could explore is to have a generic symbol table to store both method names and paths. This would result in a reduction of stored strings as we wouldn't need to have the combination of path names x directories, but we would waste more space as these two string buffers have different sizes.

This commit doesn't fully fix the issue of collisions. A better fix will come later by processing the samples in a streaming fashion.

Signed-off-by: Francisco Javier Honduvilla Coto javierhonduco@gmail.com

This commit changes a few things, first, it gets rid of the random IDs
used to identify frames, as the birthday paradox, and production, shows
that we'll quickly find duplicates using 32 bits. This now uses a global
array. In BPF arrays are zero initialised, so we don't need to do that
explicitly. As we are using shared memory across multiple threads, we
read + increment the IDs atomically.

In addition to these changes that should reduce the chances of
collisions and hence of lost frames, also bump the number of frames we
can store and reduce the amount of processes we can profile at once, as
it's a very high number. We can tune these later on.

Notes
=====

Other approaches that I've checked for ID generation are to use a per CPU
counter and then offset the CPU number. This works but care has to be
taken to make this work in systems with an "unbounded" number of CPUs.
Not using an atomic get+increment is cheaper, but we would have to call
the CPU id helper, so perf wise it's probably in the same ballpark.

Another avenue we could explore is to have a generic symbol table to
store both method names and paths. This would result in a reduction of
stored strings as we wouldn't need to have the combination of path names
x directories, but we would waste more space as these two string buffers
have different sizes.

This commit doesn't fully fix the issue of collisions. A better fix will
come later by processing the samples in a streaming fashion.

Signed-off-by: Francisco Javier Honduvilla Coto <javierhonduco@gmail.com>
@javierhonduco
Copy link
Owner Author

@manuelfelipe This should help with #70, let me know if that's not the case 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant