-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(conductor, relayer)!: brotli compress data blobs #1006
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice improvement.
I only have very minor nits, none of them blocking.
crates/astria-sequencer-relayer/src/relayer/write/conversion.rs
Outdated
Show resolved
Hide resolved
size_hint: data.len(), | ||
..Default::default() | ||
}; | ||
let mut output = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An easy perf improvement might be Vec::with_capacity(data.len())
(as an upper limit) or (maybe better because deterministic) a multiple of the buffer size set in CompressorWriter::with_params
: i.e. Vec::with_capacity(BUF_SIZE * 8)
with BUF_SIZE = 4096
.
Vec
grows exponentially. In it would reallocate like 0 -> 4096 -> 8192 -> 16384
(note that Vec::new()
is a no-op, so in this case doing Vec::with_capacity(4096)
or not does the same amount of work in this specific case).
crates/astria-sequencer-relayer/src/relayer/write/conversion.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too - just a couple of minor points.
crates/astria-sequencer-relayer/src/relayer/write/conversion.rs
Outdated
Show resolved
Hide resolved
crates/astria-core/src/brotli.rs
Outdated
}; | ||
// Capacity based on expecting best potential compression ratio of 8x (based on benchmarks) | ||
// only would need to resize 2 times to reach worst case. | ||
let mut output = Vec::with_capacity(data.len() / 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think being overly optimistic doesn't buy us anything. We will likely incur at least 1 reallocation. Let's go with data.len() / 4
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks great (especially now with @Fraser999 having caught the missing cast-to-f64 before calculating the ratio).
I also think this does not need extra tests past the blackbox tests in conductor (unit tests would just test if brotli
indeed compresses/decompresses).
I believe the compression buffer is chosen to optimistically. The 8x compression factor is likely never reached, so we'd incur at least 1 resize always. IMO going with Vec::with_capacity(data.len() / 4)
is more realistic.
But either way, this is not a blocker.
… Celestia fees (#1045) ## Summary Batch mutiple Sequencer blocks into single Celestia blobs to reduce fee payments. ## Background Until now, each Sequencer block was turned into multiple blobs (one for overall sequencer block metadata, and one blob per rollup that had transactiosn in the sequencer block). This wasn't as efficient as it could be because the new compression scheme introduced in #1006 can only come to bear with more bytes to compress. Relayer will collect sequencer blocks up to a total (compressed) size of 1MB (1/2 of the current max of 2MB that Celestia blocks can be). ## Changes - Introduce protobuf messages `astria.sequencerblock.v1alpha1.CelestiaHeaderList` and `astria.sequencerblock.v1alpha1.CelestiaRollupDataList` - Rename `astria.sequencerblock.v1alpha1.CelestiaRollupBlob` to `astria.sequencerblock.v1alpha1.CelestiaRollupData` - Rename `astria.sequencerblock.v1alpha1.CelestiaSequencerBlob` to `astria.sequencerblock.v1alpha1.CelestiaHeader` - Collect Sequencer Blocks into the `*List` protobuf messages before posting them to Celestia (instead of splitting up each Sequencer block into mutiple blobs and posting them one by one). ## Testing Add unit tests around the next submission aggregation logic. Update conductor blackbox tests. ## Metrics + `CELESTIA_PAYLOAD_CREATION_LATENCY`: histogram with microsecond units to track the time it takes to create a payload of Celestia blobs (encoding + compressing all protobufs) + metrics for reporting compression ratio and total compressed payload size were moved from the payload/blob construction phase to the submission phase. ## Breaking Changelist - Relayer and Conductor write/read new protobuf messages to/from Celestia. ## Related Issues Closes #1042 Closes #1049
Summary
Uses brotli level 5 to compress data posted to celestia in relayer, and decompresses from read in conductor.
Background
Data size is largest component of DA costs, compression can reduce costs of data posting. Based on research, and testing brotli level 5 provides us fast enough compression with a good compression ratio for encoded protobuf data.
Changes
Testing
updated blackbox tests, smoke test on the repo
Metrics
TOTAL_ASTRIA_BLOB_DATA_SIZE_FOR_BLOCK
- the total sum of bytes which are in the data field of blobs for a single sequencer blockCOMPRESSION_RATIO_FOR_ASTRIA_BLOCK
- the compression ratio of data for the sequencer block