Releases: foxglove/mcap
releases/mcap-cli/v0.0.45
go: reader: make ordered iteration faster (#1168)

### Changelog

- Added `Message.PopulateFrom([]byte, bool) error` method to Message, which allows readers to avoid reallocating Message structs between messages.
- Added `MessageIterator.NextInto(*Message) (*Schema, *Channel, *Message, error)` method to the MessageIterator interface. This method allows the caller to re-use memory between message reads, and to avoid allocating a new Message on the heap for every message read.
- Optimized indexed reading to read much faster and use much less memory.

### Docs

- [x] Update docstrings
- [x] Update examples

### Description

This PR makes message iteration much faster. The included benchmark shows a pretty significant speedup:

#### Before

Note: the comparison benchmark based on `main` uses `MessageIterator.Next()` to read messages.

```
$ go test . -run=^$ -bench=BenchmarkReader -benchmem -benchtime=10x
goos: darwin
goarch: arm64
pkg: github.com/foxglove/mcap/go/mcap
BenchmarkReader/inorder/no_index-8           10   532682221 ns/op  238.83 MB/s   7826155 msg/s  533306349 B/op   8054975 allocs/op
BenchmarkReader/inorder/index_file_order-8   10  1886590288 ns/op   67.29 MB/s   2204813 msg/s  909884028 B/op  12114339 allocs/op
BenchmarkReader/inorder/index_time_order-8   10  2248067917 ns/op   54.39 MB/s   1782145 msg/s  909889379 B/op  12114382 allocs/op
BenchmarkReader/inorder/index_rev_order-8    10  2324738488 ns/op   48.90 MB/s   1602346 msg/s  910261216 B/op  12114355 allocs/op
BenchmarkReader/inorder/bare_lexer-8         10   196806788 ns/op  660.42 MB/s  21640757 msg/s   17005039 B/op      4672 allocs/op
BenchmarkReader/minor/no_index-8             10   509497992 ns/op  241.92 MB/s   7927082 msg/s  531637254 B/op   8054932 allocs/op
BenchmarkReader/minor/index_file_order-8     10  1837735883 ns/op   66.97 MB/s   2194637 msg/s  909846889 B/op  12114373 allocs/op
BenchmarkReader/minor/index_time_order-8     10  2250390946 ns/op   54.82 MB/s   1796497 msg/s  909844632 B/op  12114332 allocs/op
BenchmarkReader/minor/index_rev_order-8      10  2360883250 ns/op   53.23 MB/s   1744292 msg/s  910212621 B/op  12114308 allocs/op
BenchmarkReader/minor/bare_lexer-8           10   195830417 ns/op  638.20 MB/s  20912477 msg/s   15341232 B/op      4655 allocs/op
BenchmarkReader/major/no_index-8             10   510946658 ns/op  241.74 MB/s   7921189 msg/s  532934768 B/op   8054945 allocs/op
BenchmarkReader/major/index_file_order-8     10  1841807000 ns/op   66.35 MB/s   2174050 msg/s  909833931 B/op  12114348 allocs/op
BenchmarkReader/major/index_time_order-8     10  2247866758 ns/op   54.53 MB/s   1786987 msg/s  909836941 B/op  12114379 allocs/op
BenchmarkReader/major/index_rev_order-8      10  2328824133 ns/op   51.12 MB/s   1675101 msg/s  910215724 B/op  12114370 allocs/op
BenchmarkReader/major/bare_lexer-8           10   198086167 ns/op  632.44 MB/s  20724011 msg/s   16635893 B/op      4661 allocs/op
PASS
ok  github.com/foxglove/mcap/go/mcap 248.060s
```

#### After

```
% go test . -run=^$ -bench=BenchmarkReader -benchmem -benchtime=10x
goos: darwin
goarch: arm64
pkg: github.com/foxglove/mcap/go/mcap
BenchmarkReader/inorder/no_index-8           10  209814421 ns/op  596.62 MB/s  19550071 msg/s  17491784 B/op   6310 allocs/op
BenchmarkReader/inorder/index_file_order-8   10  340508775 ns/op  360.08 MB/s  11799088 msg/s  10446040 B/op  16981 allocs/op
BenchmarkReader/inorder/index_time_order-8   10  341343088 ns/op  359.00 MB/s  11763672 msg/s  10443932 B/op  16955 allocs/op
BenchmarkReader/inorder/index_rev_order-8    10  348526088 ns/op  356.74 MB/s  11689775 msg/s   9996309 B/op  16964 allocs/op
BenchmarkReader/inorder/bare_lexer-8         10  187405846 ns/op  664.57 MB/s  21776806 msg/s  17439823 B/op   4674 allocs/op
BenchmarkReader/minor/no_index-8             10  211110267 ns/op  587.06 MB/s  19236916 msg/s  16522652 B/op   6284 allocs/op
BenchmarkReader/minor/index_file_order-8     10  356903517 ns/op  336.77 MB/s  11035283 msg/s  10419253 B/op  17006 allocs/op
BenchmarkReader/minor/index_time_order-8     10  552369746 ns/op  218.44 MB/s   7157955 msg/s  10444996 B/op  17744 allocs/op
BenchmarkReader/minor/index_rev_order-8      10  555191658 ns/op  220.27 MB/s   7217936 msg/s   9971279 B/op  17665 allocs/op
BenchmarkReader/minor/bare_lexer-8           10  194812112 ns/op  554.57 MB/s  18172347 msg/s  16473271 B/op   4670 allocs/op
BenchmarkReader/major/no_index-8             10  211406192 ns/op  579.88 MB/s  19001450 msg/s  17365727 B/op   6291 allocs/op
BenchmarkReader/major/index_file_order-8     10  354124750 ns/op  347.12 MB/s  11374355 msg/s  10418725 B/op  16979 allocs/op
BenchmarkReader/major/index_time_order-8     10  566783688 ns/op  215.38 MB/s   7057431 msg/s  16452847 B/op  17690 allocs/op
BenchmarkReader/major/index_rev_order-8      10  563155871 ns/op  218.15 MB/s   7148236 msg/s  15986112 B/op  17699 allocs/op
BenchmarkReader/major/bare_lexer-8           10  195610721 ns/op  633.25 MB/s  20750327 msg/s  17316992 B/op   4672 allocs/op
PASS
ok  github.com/foxglove/mcap/go/mcap 68.716s
```

For the unindexed message iterator, all of the difference comes from:

- being able to re-use a Message struct between calls to `Next()`
- switching from storing channels and schemas in maps to slices
- using and re-using an internal buffer for lexing MCAP records

For the indexed message iterator, we do all of the same things plus:

- We maintain a pool of decompressed chunk buffers, which are re-used after all of the messages from a given chunk are read out.
- We no longer read message indexes from the file, choosing instead to read the chunk content and build our own message index in memory. This allows us to read files written without message indexes in order, and also reduces I/O, which in some cases is *probably faster* (slow network connections with large message index overheads).
- We no longer use a heap to maintain order; instead, we maintain a sorted array of unread message indexes. Every time we encounter a new chunk:
  - if this chunk does not overlap with the last, clear the message index array and write the new chunk's messages in. If they are already in order, do not sort. This makes ordered iteration over an in-order MCAP as fast as unordered iteration.
  - if the new chunk's messages are not in order, sort the new chunk's messages.
  - if the new chunk overlaps with the last, append the new chunk's messages, and sort all unread messages.

#### New API justification

The issues with `Next(p []byte) (*Schema, *Channel, *Message, error)` that caused me to explore alternatives are:

- The buffer `p` is used to store the message data, but if it isn't big enough, a new buffer is allocated for every message. A common idiom is to return the newly allocated buffer so that the caller can take ownership of it and re-use it, but the current API doesn't allow that.
- A new Message struct is allocated on the heap for every iteration. Even if the message goes out of scope on the next loop, this still causes significant work for the garbage collector.

The new function signature re-uses the message passed in, if one is passed in. If `nil` is passed, it creates a new Message.
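
The chunk-ordering rules in the description (clear and refill when a chunk does not overlap, append and re-sort when it does, skip sorting when entries are already in order) can be sketched roughly as follows. This is a simplified illustration using only the standard library, not the reader's actual code: `msgIndex`, `chunk`, `sortByLogTime`, and `readInOrder` are hypothetical names, and chunks are assumed to be already decompressed with their start time known.

```go
package main

import (
	"fmt"
	"sort"
)

// msgIndex is a hypothetical entry in the in-memory message index.
type msgIndex struct {
	logTime uint64
	offset  int // position of the message inside its decompressed chunk
}

// chunk is a hypothetical decompressed chunk whose index has been built.
// start is assumed to equal the smallest logTime in msgs.
type chunk struct {
	start uint64
	msgs  []msgIndex
}

// sortByLogTime sorts s by log time, skipping the sort when s is already ordered.
func sortByLogTime(s []msgIndex) {
	less := func(i, j int) bool { return s[i].logTime < s[j].logTime }
	if !sort.SliceIsSorted(s, less) {
		sort.SliceStable(s, less)
	}
}

// readInOrder returns the log times of all messages in non-decreasing order,
// keeping a sorted array of unread index entries instead of a heap.
func readInOrder(chunks []chunk) []uint64 {
	// Visit chunks by start time, working on a copy of the caller's slice.
	ordered := append([]chunk(nil), chunks...)
	sort.SliceStable(ordered, func(i, j int) bool { return ordered[i].start < ordered[j].start })

	var unread []msgIndex
	var out []uint64
	for i, c := range ordered {
		if len(unread) == 0 {
			// No overlap with earlier chunks: clear the array and write the new entries in.
			unread = append(unread[:0], c.msgs...)
		} else {
			// Overlap: earlier messages are still pending, so append and re-sort everything unread.
			unread = append(unread, c.msgs...)
		}
		sortByLogTime(unread)

		// Entries at or before the next chunk's start cannot be preceded by
		// anything not yet loaded, so they are safe to emit now.
		n := len(unread)
		if i+1 < len(ordered) {
			next := ordered[i+1].start
			n = sort.Search(len(unread), func(k int) bool { return unread[k].logTime > next })
		}
		for _, m := range unread[:n] {
			out = append(out, m.logTime)
		}
		unread = unread[n:]
	}
	return out
}

func main() {
	chunks := []chunk{
		{start: 1, msgs: []msgIndex{{1, 0}, {3, 1}, {2, 2}}}, // unordered inside the chunk
		{start: 2, msgs: []msgIndex{{2, 0}, {5, 1}}},         // overlaps the first chunk
		{start: 10, msgs: []msgIndex{{10, 0}, {11, 1}}},      // no overlap, already ordered
	}
	fmt.Println(readInOrder(chunks))
}
```

For a file whose chunks never overlap and are internally ordered, the sketch never performs a sort, which matches the claim that ordered iteration over an in-order MCAP costs about the same as unordered iteration.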
releases/mcap-cli/v0.0.44
CLI v0.0.44

- Fix uint64 bug in `mcap cat`
- Remove the ros1msg JSON transcoder and reference go-rosbag
- Scan in log time order when using index
releases/mcap-cli/v0.0.43
Add strict order warning to `mcap doctor` (#1066)

`mcap doctor` now surfaces a warning when, while reading message data in file order, a message's log time is earlier than the latest log time seen so far. A new `--strict-message-order` flag turns this warning into an error.
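
The check itself amounts to tracking the latest log time seen while scanning in file order. The sketch below illustrates that idea under stated assumptions: `outOfOrder` and `check` are hypothetical helpers, the `strict` parameter only stands in for the effect of `--strict-message-order`, and none of this is taken from the CLI's source.

```go
package main

import "fmt"

// outOfOrder returns the positions, in file order, of messages whose log time
// is earlier than the latest log time seen before them.
func outOfOrder(logTimes []uint64) []int {
	var positions []int
	var latest uint64
	for i, t := range logTimes {
		if t < latest {
			positions = append(positions, i)
			continue
		}
		latest = t
	}
	return positions
}

// check prints a warning by default and returns an error when strict is true.
// The strict parameter is a hypothetical stand-in for the CLI flag.
func check(logTimes []uint64, strict bool) error {
	bad := outOfOrder(logTimes)
	if len(bad) == 0 {
		return nil
	}
	if strict {
		return fmt.Errorf("%d message(s) out of log-time order, first at position %d", len(bad), bad[0])
	}
	fmt.Printf("warning: %d message(s) out of log-time order, first at position %d\n", len(bad), bad[0])
	return nil
}

func main() {
	times := []uint64{1, 5, 3, 4, 6, 6, 2}
	_ = check(times, false)          // prints a warning
	fmt.Println(check(times, true)) // prints the error
}
```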
releases/mcap-cli/v0.0.42
cli: info does not segfault on large byte values (#1062)

### Public-Facing Changes

* CLI: fixes a segfault when an MCAP contains >1024 GiB of data or results in a "data rate" greater than 1024 GiB/s

### Description

Previously the `humanBytes` function would segfault when presented with a number > 1024^4.
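
A crash of this kind can arise when a byte formatter indexes past the end of a fixed unit table. Whether that was the actual cause is an assumption, since the notes do not say. The sketch below shows one way to write a formatter that stays in bounds for every `uint64` value; the unit table and output format are illustrative choices, not the CLI's real `humanBytes`.

```go
package main

import "fmt"

// units covers every magnitude a uint64 byte count can reach (assumed table).
var units = []string{"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"}

// humanBytes formats n with a binary unit. The loop stops at the last unit,
// so the index into units can never run past the end of the table.
func humanBytes(n uint64) string {
	value := float64(n)
	i := 0
	for value >= 1024 && i < len(units)-1 {
		value /= 1024
		i++
	}
	return fmt.Sprintf("%.2f %s", value, units[i])
}

func main() {
	for _, n := range []uint64{0, 1024, 1 << 40, 1 << 50, ^uint64(0)} {
		fmt.Println(humanBytes(n))
	}
}
```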
releases/mcap-cli/v0.0.41
go: reader: fall back to linear scan if statistics show messages outs…
releases/mcap-cli/v0.0.40
mcap convert: Fix duplicate type definition in ROS2 schemas (#1058)

### Public-Facing Changes

Fix `mcap convert` producing ROS2 schemas with duplicate type definitions

### Description

`mcap convert` could produce ROS2 schemas with duplicated subtype definitions when the same subtype was included by more than one `.msg` file. This PR fixes that by only processing subtypes that haven't been processed before.

Resolves FG-6369
releases/mcap-cli/v0.0.39
CLI v0.0.39

- Fix schema deduplication in mcap convert (#1049)
- Append attachments to merged mcap files (#1052)
releases/mcap-cli/v0.0.38
cli: fix segfault bug when using filter args (#1011)

### Public-Facing Changes

Fixes bug where the MCAP cli would segfault when using `filter -y` or `filter -n`.

### Description

Previously the `compileMatchers` function would build an array of matchers by appending to it, but the array was already initialized with a non-zero length, resulting in the first N entries of that array being uninitialized matchers. This caused a segfault when it came time to use those matchers.
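
This is a common Go pitfall: `make([]T, n)` creates `n` zero-valued elements, and `append` adds after them rather than filling them in. The sketch below reproduces the pattern with `*regexp.Regexp` as a stand-in matcher type; `compileBuggy` and `compileFixed` are hypothetical names, and the real `compileMatchers` may differ in its types and signature.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileBuggy reproduces the pattern from the description: the slice starts
// with len(patterns) nil entries, and append places the real matchers after them.
func compileBuggy(patterns []string) ([]*regexp.Regexp, error) {
	matchers := make([]*regexp.Regexp, len(patterns)) // length N, all nil
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		matchers = append(matchers, re)
	}
	return matchers, nil
}

// compileFixed allocates zero length with capacity N, so append fills from index 0.
func compileFixed(patterns []string) ([]*regexp.Regexp, error) {
	matchers := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		matchers = append(matchers, re)
	}
	return matchers, nil
}

func main() {
	patterns := []string{"^/camera", "^/lidar"}
	buggy, _ := compileBuggy(patterns)
	fixed, _ := compileFixed(patterns)
	fmt.Println(len(buggy), buggy[0] == nil) // the leading entries are nil matchers
	fmt.Println(len(fixed), fixed[0].MatchString("/camera/image"))
}
```

Calling a method on one of the nil entries dereferences a nil pointer, which is consistent with the crash reported for `filter -y` and `filter -n`.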
releases/mcap-cli/v0.0.37
Add subcommand for sorting a file (#1009) Adds a "sort" subcommand to the mcap CLI tool. This will rewrite the messages into a new file, physically sorted on time.
releases/mcap-cli/v0.0.36
Include metadata in mcap merge (#958) Includes metadata records from input files in mcap merge via a new read option. This required a breaking change to read options to avoid a dependency cycle: since I need to supply a callback option to apply to metadata records, the readopts package required awareness of "mcap" while "mcap" required awareness of readopts for configuration. To address this I have moved readopts.go under the mcap package. Users who upgrade the library will need to swap out the package name if they are using any options.