feat: improve prom write requests decode performance #3478
Codecov Report
Coverage diff (main vs. #3478):
Metric    main     #3478    +/-
Coverage  85.44%   84.93%   -0.51%
Files     895      900      +5
Lines     147093   149647   +2554
Hits      125685   127105   +1420
Misses    21408    22542    +1134
Great job! Do you think we could submit these enhancements to the main
No, this optimization is safe if and only if:
        }
        if buf.remaining() != limit {
            return Err(DecodeError::new("delimited length exceeded"));
        }
        self.series.add_to_table_data(&mut self.table_data);
    }
    3u32 => {
        // we can ignore metadata for now.
        prost::encoding::skip_field(wire_type, tag, &mut buf, ctx.clone())?;
        // todo(hl): metadata are skipped.
What exactly is the TODO?
Could we create an issue for it with a brief description?
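For context on the skipped metadata field: skipping a protobuf field means reading its key, splitting the key into a tag and a wire type, and discarding the payload according to that wire type. The following is a minimal std-only sketch of that idea, not prost's implementation. The helpers `read_varint`, `skip_payload`, and `collect_field` are hypothetical names introduced here for illustration, and group wire types are deliberately left unhandled.

```rust
/// Reads a base-128 varint from the front of `buf` and advances `buf` past it.
/// Returns `None` on truncated or over-long input. (Hypothetical helper.)
fn read_varint(buf: &mut &[u8]) -> Option<u64> {
    let mut value = 0u64;
    for i in 0..10 {
        let byte = *buf.get(i)?;
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            *buf = &buf[i + 1..];
            return Some(value);
        }
    }
    None // a varint never spans more than 10 bytes
}

/// Discards one field payload according to its wire type:
/// 0 = varint, 1 = 64-bit, 2 = length-delimited, 5 = 32-bit.
/// Group wire types (3 and 4) are not handled in this sketch.
fn skip_payload(wire_type: u8, buf: &mut &[u8]) -> Option<()> {
    let len = match wire_type {
        0 => {
            read_varint(buf)?;
            0
        }
        1 => 8,
        2 => usize::try_from(read_varint(buf)?).ok()?,
        5 => 4,
        _ => return None,
    };
    if buf.len() < len {
        return None;
    }
    *buf = &buf[len..];
    Some(())
}

/// Returns the payloads of every length-delimited field numbered `wanted_tag`,
/// skipping all other fields without interpreting them.
fn collect_field<'a>(mut buf: &'a [u8], wanted_tag: u32) -> Option<Vec<&'a [u8]>> {
    let mut out = Vec::new();
    while !buf.is_empty() {
        let key = read_varint(&mut buf)?;
        let (tag, wire_type) = ((key >> 3) as u32, (key & 0x7) as u8);
        if tag == wanted_tag && wire_type == 2 {
            let len = usize::try_from(read_varint(&mut buf)?).ok()?;
            if buf.len() < len {
                return None;
            }
            let (payload, rest) = buf.split_at(len);
            out.push(payload);
            buf = rest;
        } else {
            skip_payload(wire_type, &mut buf)?;
        }
    }
    Some(out)
}
```

Calling `collect_field(&bytes, 1)` on an encoded message returns only the field-1 payloads. A field numbered 3 is consumed and dropped, which is the behaviour the `3u32` arm above gets from `prost::encoding::skip_field`.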
        }
    }
}

#[inline(always)]
fn copy_to_bytes(data: &mut Bytes, len: usize) -> Bytes {
IIRC this is already the impl of bytes::Bytes?
Are we copying it here so that it can be inlined?
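For context on this thread: an owned, ref-counted split has to update a shared counter on every call, while a borrowed slice is only pointer and length arithmetic. The std-only sketch below contrasts the two. `SharedBuf` and `split_borrowed` are hypothetical names used for illustration; `SharedBuf` is a simplified stand-in built on `Arc<[u8]>` and does not reflect how `bytes::Bytes` is implemented internally.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a ref-counted byte buffer (not the `bytes` crate).
struct SharedBuf {
    data: Arc<[u8]>,
    start: usize,
    end: usize,
}

impl SharedBuf {
    fn new(bytes: &[u8]) -> Self {
        SharedBuf { data: Arc::from(bytes), start: 0, end: bytes.len() }
    }

    /// Splits off the first `len` bytes as an owned handle.
    /// `Arc::clone` performs an atomic increment on every call.
    fn split_to(&mut self, len: usize) -> SharedBuf {
        assert!(len <= self.end - self.start, "split out of bounds");
        let head = SharedBuf {
            data: Arc::clone(&self.data),
            start: self.start,
            end: self.start + len,
        };
        self.start += len;
        head
    }

    fn as_slice(&self) -> &[u8] {
        &self.data[self.start..self.end]
    }
}

/// Borrowed equivalent: no counter is touched, but the compiler ties the
/// returned slice to the lifetime of the source buffer.
fn split_borrowed<'a>(buf: &mut &'a [u8], len: usize) -> &'a [u8] {
    let whole: &'a [u8] = *buf; // copy the inner reference out
    let (head, tail) = whole.split_at(len);
    *buf = tail;
    head
}
```

The borrowed version is safe because the lifetime `'a` proves the source outlives each slice. The PR instead relies on the stated invariant that the decompressed buffer outlives every label, which the compiler cannot check and which is why it needs `unsafe`.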
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
N/A
What's changed and what's your intention?
This PR aims to improve the decode performance of PromWriteRequest. This was already addressed in #3425, but we still observe some room for optimization.

Decoding WriteRequest
- [...] copy_to_bytes invocation by inlining prost::encoding::bytes::merge.
- [...] copy_to_bytes, and we observe that copy_to_bytes is way slower than Golang's byte slice operation.
- With the bytes = bytes[..idx] operation repeated 120k times (which is the case when decoding labels from 10k timeseries in a WriteRequest), Rust's Bytes takes 1.2ms, while Golang's byte slice takes 30us.
- Bytes also handles reference counting when built from Vec<u8>, while Golang's byte slice only has ptr, len, and cap fields and leaves the buffer's lifetime to the garbage collector, which brings little overhead when bytes are pooled.
- Since PromLabel values are short-lived and are converted to strings and added to TableBuilder soon after a PromTimeseries is decoded, the original decompressed Bytes always outlives PromLabel::name and PromLabel::value. We can therefore introduce some unsafe operations, such as directly constructing a new Bytes from a raw pointer and len/cap fields; that is what servers::proto::copy_to_bytes does.
- servers::proto::copy_to_bytes takes 200us, still slower than Golang, but fairly acceptable.
- WriteRequest decoding time improved further from 3.0ms to 1.8ms. For reference, VictoriaMetrics' WriteRequest decoding takes 1.2ms.

Decode-only benchmark result:
Building RowInsertRequests
Aside from decoding, we also found that PromWriteRequest::as_row_insert_requests takes more time than decoding.
- [...] PromTimeseries, we need to build the schemas for each metric (table), which involves two hashmap lookups per label. That adds up to 120k hashmap lookups for a 10k-timeseries request.

Results
Still, decoding a WriteRequest with 10k timeseries and 60k labels and converting it to RowInsertRequests:

Checklist