feat: improve prom write requests decode performance #3478
Merged
waynexia merged 2 commits into GreptimeTeam:main on Mar 12, 2024
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3478 +/- ##
==========================================
- Coverage 85.44% 84.93% -0.51%
==========================================
Files 895 900 +5
Lines 147093 149647 +2554
==========================================
+ Hits 125685 127105 +1420
- Misses 21408 22542 +1134
Member
Great job! Do you think we could submit these enhancements to the main
killme2008
reviewed
Mar 11, 2024
Contributor
Author
No, this optimization is safe if and only if:
evenyag
reviewed
Mar 11, 2024
waynexia
reviewed
Mar 12, 2024
waynexia
approved these changes
Mar 12, 2024
evenyag
approved these changes
Mar 12, 2024
tisonkun
reviewed
Mar 13, 2024
3u32 => {
    // we can ignore metadata for now.
    prost::encoding::skip_field(wire_type, tag, &mut buf, ctx.clone())?;
    // todo(hl): metadata are skipped.
Collaborator
What exactly is the TODO?
Could we create an issue for it with a short description?
tisonkun
reviewed
Mar 14, 2024
}

#[inline(always)]
fn copy_to_bytes(data: &mut Bytes, len: usize) -> Bytes {
Collaborator
IIRC this is already the impl of bytes::Bytes?
Do we copy it here so that it can be inlined?
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
N/A
What's changed and what's your intention?
This PR aims to improve the decode performance of `PromWriteRequest`. This was already addressed in #3425, but we still observe some room for optimization.

Decoding `WriteRequest`
- … `copy_to_bytes` invocation by inlining `prost::encoding::bytes::merge`.
- … `copy_to_bytes`, and we observe that `copy_to_bytes` is way slower than Golang's byte slice operation.
- With the `bytes = bytes[..idx]` operation repeated 120k times (which is the case when decoding labels from 10k timeseries in a `WriteRequest`), Rust's `Bytes` takes 1.2ms, while Golang's byte slice takes 30us.
- `Bytes` also handles reference counting when built from `Vec<u8>`, while Golang's byte slice only has `ptr`, `len`, and `cap` fields and leaves lifetime tracking to the garbage collector, which brings little overhead when bytes are pooled.
- Since `PromLabel`s are short-lived and are converted to strings and added to `TableBuilder` soon after a `PromTimeseries` is decoded, the original decompressed `Bytes` always outlives `PromLabel::name` and `PromLabel::value`. We can therefore introduce some unsafe operations, such as directly constructing a new `Bytes` from a raw pointer and len/cap fields; that is what `servers::proto::copy_to_bytes` does.
- `servers::proto::copy_to_bytes` takes 200us, still slower than Golang, but fairly acceptable.
- `WriteRequest` decoding time further improved from 3.0ms to 1.8ms. For reference, VictoriaMetrics' `WriteRequest` decoding takes 1.2ms.

Decode-only benchmark result:
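As an illustration of the slicing cost discussed in the Decoding list (this is not the benchmark output, and not the code in this PR), here is a minimal std-only sketch. `SharedBytes` and `split_borrowed` are hypothetical names: the first stands in for a reference-counted view that updates an atomic counter on every split, the second for a plain pointer-and-length slice whose validity relies on the source buffer outliving it. The PR itself, per the description, achieves a similar effect with an unsafe construction of `Bytes` in `servers::proto::copy_to_bytes`; the borrowed slice below is a safe analogue, assuming the buffer outlives every label.

```rust
use std::sync::Arc;

/// Hypothetical reference-counted byte view; a stand-in for illustration,
/// not the real `bytes::Bytes`.
struct SharedBytes {
    buf: Arc<[u8]>,
    start: usize,
    end: usize,
}

impl SharedBytes {
    fn new(data: Vec<u8>) -> Self {
        let end = data.len();
        SharedBytes { buf: Arc::from(data), start: 0, end }
    }

    /// Split off the first `len` bytes as a new view. Cloning the `Arc`
    /// performs an atomic reference-count increment on every call.
    fn split_to(&mut self, len: usize) -> SharedBytes {
        assert!(len <= self.end - self.start, "split out of bounds");
        let head = SharedBytes {
            buf: Arc::clone(&self.buf),
            start: self.start,
            end: self.start + len,
        };
        self.start += len;
        head
    }

    fn as_slice(&self) -> &[u8] {
        &self.buf[self.start..self.end]
    }

    /// Number of views currently sharing the underlying buffer.
    fn ref_count(&self) -> usize {
        Arc::strong_count(&self.buf)
    }
}

/// Borrowed variant: only pointer/length arithmetic, no counting.
/// The lifetime `'a` encodes "the source buffer outlives the label".
fn split_borrowed<'a>(data: &mut &'a [u8], len: usize) -> &'a [u8] {
    let whole: &'a [u8] = *data;
    let (head, tail) = whole.split_at(len);
    *data = tail;
    head
}

fn main() {
    let payload = b"__name__cpu_usage".to_vec();

    // Reference-counted splitting: each label view holds a count.
    let mut shared = SharedBytes::new(payload.clone());
    let name = shared.split_to(8);
    let value = shared.split_to(9);
    assert_eq!(name.as_slice(), b"__name__");
    assert_eq!(value.as_slice(), b"cpu_usage");
    assert_eq!(shared.ref_count(), 3);

    // Borrowed splitting: same bytes, no counter updates.
    let mut rest: &[u8] = &payload;
    let name = split_borrowed(&mut rest, 8);
    let value = split_borrowed(&mut rest, 9);
    assert_eq!(name, b"__name__");
    assert_eq!(value, b"cpu_usage");
    assert!(rest.is_empty());
}
```

The sketch only shows where the per-split bookkeeping comes from; it makes no claim about the timings quoted in the description.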

Building `RowInsertRequests`
Aside from decoding, we also find that `PromWriteRequest::as_row_insert_requests` takes more time than decoding.
- For each `PromTimeseries`, we need to build the schemas for each metric (table), which involves two hashmap lookups per label. That adds up to 120k hashmap lookups for a 10k timeseries request.

Results
Still, decoding a `WriteRequest` with 10k timeseries and 60k labels and converting it to `RowInsertRequests`:

Checklist