Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(conductor, relayer)!: brotli compress data blobs #1006

Merged
merged 15 commits into from
Apr 25, 2024
Merged

Conversation

joroshiba
Copy link
Member

@joroshiba joroshiba commented Apr 25, 2024

Summary

Uses brotli level 5 to compress data posted to celestia in relayer, and decompresses from read in conductor.

Background

Data size is largest component of DA costs, compression can reduce costs of data posting. Based on research, and testing brotli level 5 provides us fast enough compression with a good compression ratio for encoded protobuf data.

Changes

  • compression with brotli in relayer
  • decompression in conductor

Testing

updated blackbox tests, smoke test on the repo

Metrics

  • TOTAL_ASTRIA_BLOB_DATA_SIZE_FOR_BLOCK - the total sum of bytes which are in the data field of blobs for a single sequencer block
  • COMPRESSION_RATIO_FOR_ASTRIA_BLOCK - the compression ratio of data for the sequencer block

@joroshiba joroshiba added the docker-build used to trigger docker builds on PRs label Apr 25, 2024
@joroshiba joroshiba requested a review from a team as a code owner April 25, 2024 03:57
@joroshiba joroshiba requested a review from noot April 25, 2024 03:57
@github-actions github-actions bot added conductor pertaining to the astria-conductor crate sequencer-relayer pertaining to the astria-sequencer-relayer crate labels Apr 25, 2024
@joroshiba joroshiba requested review from SuperFluffy and Fraser999 and removed request for noot April 25, 2024 03:58
@joroshiba joroshiba changed the title feat(conductor, relayer): brotli compress data blobs feat(conductor, relayer)!: brotli compress data blobs Apr 25, 2024
Copy link
Member

@SuperFluffy SuperFluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very nice improvement.

I only have very minor nits, none of them blocking.

crates/astria-conductor/tests/blackbox/helpers/mod.rs Outdated Show resolved Hide resolved
size_hint: data.len(),
..Default::default()
};
let mut output = Vec::new();
Copy link
Member

@SuperFluffy SuperFluffy Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An easy perf improvement might be Vec::with_capacity(data.len()) (as an upper limit) or (maybe better because deterministic) a multiple of the buffer size set in CompressorWriter::with_params: i.e. Vec::with_capacity(BUF_SIZE * 8) with BUF_SIZE = 4096.

Vec grows exponentially. In it would reallocate like 0 -> 4096 -> 8192 -> 16384 (note that Vec::new() is a no-op, so in this case doing Vec::with_capacity(4096) or not does the same amount of work in this specific case).

Copy link
Contributor

@Fraser999 Fraser999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too - just a couple of minor points.

crates/astria-sequencer-relayer/src/metrics_init.rs Outdated Show resolved Hide resolved
@joroshiba joroshiba requested a review from a team as a code owner April 25, 2024 19:34
@github-actions github-actions bot added the ci issues that are related to ci and github workflows label Apr 25, 2024
};
// Capacity based on expecting best potential compression ratio of 8x (based on benchmarks)
// only would need to resize 2 times to reach worst case.
let mut output = Vec::with_capacity(data.len() / 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think being overly optimistic doesn't buy us anything. We will likely incur at least 1 reallocation. Let's go with data.len() / 4?

Copy link
Member

@SuperFluffy SuperFluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks great (especially now with @Fraser999 having caught the missing cast-to-f64 before calculating the ratio).

I also think this does not need extra tests past the blackbox tests in conductor (unit tests would just test if brotli indeed compresses/decompresses).

I believe the compression buffer is chosen to optimistically. The 8x compression factor is likely never reached, so we'd incur at least 1 resize always. IMO going with Vec::with_capacity(data.len() / 4) is more realistic.

But either way, this is not a blocker.

@joroshiba joroshiba enabled auto-merge April 25, 2024 22:20
@joroshiba joroshiba added this pull request to the merge queue Apr 25, 2024
Merged via the queue into main with commit 1398555 Apr 25, 2024
36 checks passed
@joroshiba joroshiba deleted the joroshiba/brotli branch April 25, 2024 22:33
github-merge-queue bot pushed a commit that referenced this pull request May 8, 2024
… Celestia fees (#1045)

## Summary
Batch mutiple Sequencer blocks into single Celestia blobs to reduce fee
payments.

## Background
Until now, each Sequencer block was turned into multiple blobs (one for
overall sequencer block metadata, and one blob per rollup that had
transactiosn in the sequencer block). This wasn't as efficient as it
could be because the new compression scheme introduced in
#1006 can only come to bear with
more bytes to compress.

Relayer will collect sequencer blocks up to a total (compressed) size of
1MB (1/2 of the current max of 2MB that Celestia blocks can be).

## Changes
- Introduce protobuf messages
`astria.sequencerblock.v1alpha1.CelestiaHeaderList` and
`astria.sequencerblock.v1alpha1.CelestiaRollupDataList`
- Rename `astria.sequencerblock.v1alpha1.CelestiaRollupBlob` to
`astria.sequencerblock.v1alpha1.CelestiaRollupData`
- Rename `astria.sequencerblock.v1alpha1.CelestiaSequencerBlob` to
`astria.sequencerblock.v1alpha1.CelestiaHeader`
- Collect Sequencer Blocks into the `*List` protobuf messages before
posting them to Celestia (instead of splitting up each Sequencer block
into mutiple blobs and posting them one by one).

## Testing
Add unit tests around the next submission aggregation logic. Update
conductor blackbox tests.

## Metrics
+ `CELESTIA_PAYLOAD_CREATION_LATENCY`: histogram with microsecond units
to track the time it takes to create a payload of Celestia blobs
(encoding + compressing all protobufs)
+ metrics for reporting compression ratio and total compressed payload
size were moved from the payload/blob construction phase to the
submission phase.

## Breaking Changelist
- Relayer and Conductor write/read new protobuf messages to/from
Celestia.

## Related Issues
Closes #1042
Closes #1049
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci issues that are related to ci and github workflows conductor pertaining to the astria-conductor crate docker-build used to trigger docker builds on PRs sequencer-relayer pertaining to the astria-sequencer-relayer crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants