Releases: javierhonduco/rbperf
rbperf v0.4.0
Notable changes
- Add support for Ruby 3.2.0 + 3.2.1 by @manuelfelipe and @javierhonduco in #64
- Allow running on hosts without PMU #63
- Bumped max stack size to 750 from 150 frames #65
- Reduce chances of dropped samples #74
- Fix panic on frame id 0 as it's valid #66
- Handle missing frames when reading BPF maps #68
- Improve sample statistics #75
Full Changelog: v0.3.0...v0.4.0
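Two of the fixes above (#66 and #68) concern how frame ids are turned back into frames in userspace. The sketch below is not rbperf's actual code: `FrameTable`, `MISSING_FRAME`, and `symbolize` are hypothetical names, and a plain `HashMap` stands in for data read from a BPF map. It only illustrates the two ideas, namely that id 0 is treated like any other id and that an id with no entry yields a placeholder rather than a panic.

```rust
use std::collections::HashMap;

/// Hypothetical userspace copy of a frame table read from a BPF map.
type FrameTable = HashMap<u32, String>;

/// Placeholder used when a frame id has no entry in the table.
const MISSING_FRAME: &str = "<missing frame>";

/// Resolve every frame id in a stack to a printable name.
/// Id 0 gets no special treatment, and unknown ids do not panic.
fn symbolize(stack: &[u32], table: &FrameTable) -> Vec<String> {
    stack
        .iter()
        .map(|id| {
            table
                .get(id)
                .cloned()
                .unwrap_or_else(|| MISSING_FRAME.to_string())
        })
        .collect()
}
```

Returning a placeholder keeps the rest of the stack usable when a single frame is missing, at the cost of a less informative entry for that frame.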
rbperf v0.3.0
Notable changes
- Only set a perf event on present CPUs. This was making `rbperf` fail for machines with certain CPU layouts, which seems to affect some AMD configurations. Found and fixed by @shaver in #50 🎉.
- Add support for Ruby 3.1.3 #53.
- Added support for rudimentary line number tracking in #52. This is opt-in with `--enable-linenos` as it might not be accurate. Fetching accurate line numbers can be quite complex and is not implemented yet. More details in the PR.
- Updated kernel headers for BTF to kernel 6.0.18-200.
- Set all dependencies to published versions. Git was used for some unreleased features we needed. Changing this to released versions allows us to make `rbperf` packageable for distros.
- Before this commit we used the system's libelf and libz, which might be different across hosts. Now we download and build them to ensure that we have more control over our non-Rust dependencies and that we compile them with Clang + with frame pointers (more on this coming in a few weeks!). Starting from this release, we'll ship the release versions specified here.
- Updated CI to Rust 1.67.0 which revamps the queue we use to send events from BPF to the workers that process them.
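The first item above restricts perf events to present CPUs. The release notes do not say how rbperf determines that set, so the following is only a sketch under one assumption: that the list is read from the Linux sysfs file `/sys/devices/system/cpu/present`, which holds a CPU list such as `0-3,8`. The function names `parse_cpu_list` and `present_cpus` are hypothetical.

```rust
use std::fs;
use std::io;

/// Parse a kernel CPU list such as "0-3,8,10-11" into individual CPU ids.
/// Returns None if any part of the list is malformed.
fn parse_cpu_list(list: &str) -> Option<Vec<u32>> {
    let mut cpus = Vec::new();
    for part in list.trim().split(',') {
        if part.is_empty() {
            continue;
        }
        match part.split_once('-') {
            // A range like "0-3" expands to every id in between, inclusive.
            Some((start, end)) => {
                let start: u32 = start.parse().ok()?;
                let end: u32 = end.parse().ok()?;
                if start > end {
                    return None;
                }
                cpus.extend(start..=end);
            }
            // A single id like "8".
            None => cpus.push(part.parse().ok()?),
        }
    }
    Some(cpus)
}

/// Read the CPUs the kernel reports as present (Linux only, assumed source).
fn present_cpus() -> io::Result<Vec<u32>> {
    let raw = fs::read_to_string("/sys/devices/system/cpu/present")?;
    parse_cpu_list(&raw)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "malformed CPU list"))
}
```

A profiler would then open one perf event per id returned by `present_cpus()` instead of assuming that ids run contiguously from 0 up to the CPU count.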
rbperf v0.2.1
Changes
- Now `rbperf` is also shipped as a statically linked binary. Until now, libc and other libraries were required in the system, which was a problem in some distros that shipped older, incompatible versions. There's an added CI job to ensure that the static build works.
- Enabled LTO for release builds, reducing binary size enough to produce binaries with the same size even though now they are statically linked
- Fixes to allow profiling Ruby processes that are statically linked, rather than dynamically linking libruby
rbperf v0.2.0
Changes
- Added an `xtask` task to generate the Ruby configuration files, which contain the details of its ABI that we need to walk the stack, rather than having all values manually generated. There's still some work to do to avoid duplication and ensure that every value is programmatically generated ec4724d;
- Simplify the Ruby stack walker, which used to be done in two phases. This reduces the number of CPU instructions needed to walk the stack and increases the readability of the code 6f4f78c;
rbperf v0.1.0
This is the first rbperf release! 🎉
New features
- Written in Rust, which brings a lot of performance improvements and excellent dev UX features. Stay tuned for a write-up on why Rust is an excellent fit for rbperf 50231c3;
- Using libbpf, via libbpf-rs, which brings us a lot of goodies such as not having to ship/use LLVM and recompile the BPF code every single time, BTF, CO-RE, among many other features;
- A bunch of correctness issues were squashed. On some occasions, the Ruby stack walker did not stop after the last frame and bogus frames were introduced;
- Added support for Ruby 3.0.0, 3.0.4, and 3.1.2;
- Added `rbperf info`, which shows useful information about the system and the BPF features it supports e81748a;
- Added detection of PID reuse, to ensure that the right process is the one being profiled 272c8f3;
- `rbperf record --pid <pid> syscall --list` lists the available system calls we can trace eda4f21;
- With `--ringbuf` the new ring buffer interface can be used instead of perf events. This new API can send data to userspace with lower overhead 8a1e048;
- Added `--verbose-bpf-logging` to the record subcommand to enable BPF logging that can be tailed at `/sys/kernel/debug/tracing/trace_pipe`. This is very useful while troubleshooting BPF issues, and having it as a flag helps reduce `rbperf`'s overhead as the loader removes branches that can be proved as non-reachable 422c5ca;
- Libelf and zlib are now statically linked libbpf/libbpf-sys@371a85d, libbpf/libbpf-rs@5bed52a;
- And many many others!
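The list above mentions detection of PID reuse without describing the mechanism. One common approach on Linux, shown here purely as an assumption and not as what commit 272c8f3 does, is to remember a process's start time from `/proc/<pid>/stat` (field 22) and check later that it has not changed. All function names below are hypothetical.

```rust
use std::fs;

/// Extract the process start time (field 22, in clock ticks since boot)
/// from the contents of /proc/<pid>/stat.
fn parse_start_time(stat: &str) -> Option<u64> {
    // The command name (field 2) is wrapped in parentheses and may itself
    // contain spaces or ')', so split after the last closing parenthesis.
    let after_comm = &stat[stat.rfind(')')? + 1..];
    // The remainder starts at field 3, so field 22 is at index 19.
    after_comm.split_whitespace().nth(19)?.parse().ok()
}

/// Read the current start time of `pid`, or None if it cannot be read.
fn start_time(pid: u32) -> Option<u64> {
    let stat = fs::read_to_string(format!("/proc/{}/stat", pid)).ok()?;
    parse_start_time(&stat)
}

/// True if `pid` still refers to the process seen when `expected` was recorded.
fn is_same_process(pid: u32, expected: u64) -> bool {
    start_time(pid) == Some(expected)
}
```

In this sketch a profiler would call `start_time(pid)` once when profiling begins and `is_same_process(pid, recorded)` before attributing later samples to that process.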
Feel free to send any bugs, feedback, ideas, or comments, either by opening an issue or directly to me!
Removed features (so far)
- Uprobes/USDTs have not been implemented yet. They are super useful, especially for allocation profiling, and will come later on;
- There's no binary format. While it would be very useful to have an intermediate format that can be converted to a variety of outputs, such as flamegraphs and so on, I wanted to keep the focus on correctness in improving the current code and APIs. Once these things are fleshed out the binary format will be reconsidered
Acknowledgements
Thanks so much to all of you that have tried rbperf. Your feedback has been invaluable!