Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: optimize series order #3345

Merged
merged 4 commits into from
Jun 10, 2024
Merged

feat: optimize series order #3345

merged 4 commits into from
Jun 10, 2024

Conversation

kolesnikovae
Copy link
Collaborator

@kolesnikovae kolesnikovae commented Jun 7, 2024

Series order determines the physical placement of profiling data in blocks. It has been observed that the standard prometheus label and series order is not aligned with our access patterns, leading to an increase in the read amplification factor and query latency.

The PR proposes an option to enforce the optimized order. The main access pattern is a query that targets a specific profile/sample type of a specific service, therefore we want all the profiles matching such query to be stored sequentially. To achieve this, we ensure that the first labels of any series are __profile_type__ and __service_name__.

The change should be applied with great caution. Since we can only enforce the order in ingesters (__profile_type__ is not available in distributors), we need to ensure that all ingester instances either apply the optimization for a given profile, or not. Otherwise, there's possibility that the profile replicas will be handled differently, and the same series will have two representations: one with the standard order, and another one optimized; this will make it challenging or impossible to resolve the conflicts at query time and compaction. This is solved by "marking" profiles with the new private label __order__ in distributors, which indicates whether we need to enforce the new order, and is removed before ingestion. This will result in doubling the active series temporarily but will allow for the normal deduplication process.

N.B.: No changes are required in the query path, because the order does not affect "user" labels. For the Series call that lists profile types and services, the order does not change.

Thus, the change should be applied in two steps:

  1. Rollout the new version: all ingester instances must run the new version.
  2. Enable the enforce_labels_order limit.

Eventually, this optimization should become the default behaviour.

The optimization is only recommended if you are using user-defined labels (static or dynamic) that begin with _ or with a capital letter and are experiencing very poor query performance.

@kolesnikovae kolesnikovae marked this pull request as ready for review June 7, 2024 09:18
@kolesnikovae kolesnikovae requested a review from a team as a code owner June 7, 2024 09:18
Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach looks good to me, but I suspect there is some other problem, at least in when I tried this on a local go run ./cmd/pyroscope --config.file ./cmd/pyroscope/pyroscope.yaml --pyroscopedb.max-block-duration 30s -validation.enforce-labels-order=true pyroscope I couldn't get it to produce the right order in TSDB:

❯ ./profilecli admin tsdb series index.tsdb
{"SeriesRef":57,"SeriesIndex":0,"Labels":{"Ahat_ever":"AAA","__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:inuse_space:bytes:space:bytes","__service_name__":"profilecli-upload","__type__":"inuse_space","__unit__":"bytes","service_name":"profilecli-upload"}}
{"SeriesRef":61,"SeriesIndex":1,"Labels":{"Ahat_ever":"AAC","__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:inuse_space:bytes:space:bytes","__service_name__":"profilecli-upload","__type__":"inuse_space","__unit__":"bytes","service_name":"profilecli-upload"}}
{"SeriesRef":65,"SeriesIndex":2,"Labels":{"___hat_ever":"AAA","__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:inuse_space:bytes:space:bytes","__service_name__":"profilecli-upload","__type__":"inuse_space","__unit__":"bytes","service_name":"profilecli-upload"}}
{"SeriesRef":69,"SeriesIndex":3,"Labels":{"__name__":"block","__period_type__":"contentions","__period_unit__":"count","__profile_type__":"block:contentions:count:contentions:count","__service_name__":"pyroscope","__type__":"contentions","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":73,"SeriesIndex":4,"Labels":{"__name__":"block","__period_type__":"contentions","__period_unit__":"count","__profile_type__":"block:delay:nanoseconds:contentions:count","__service_name__":"pyroscope","__type__":"delay","__unit__":"nanoseconds","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":77,"SeriesIndex":5,"Labels":{"__name__":"goroutines","__period_type__":"goroutine","__period_unit__":"count","__profile_type__":"goroutines:goroutine:count:goroutine:count","__service_name__":"pyroscope","__type__":"goroutine","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":81,"SeriesIndex":6,"Labels":{"__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:alloc_objects:count:space:bytes","__service_name__":"pyroscope","__type__":"alloc_objects","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":85,"SeriesIndex":7,"Labels":{"__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:alloc_space:bytes:space:bytes","__service_name__":"pyroscope","__type__":"alloc_space","__unit__":"bytes","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":89,"SeriesIndex":8,"Labels":{"__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:inuse_objects:count:space:bytes","__service_name__":"pyroscope","__type__":"inuse_objects","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":93,"SeriesIndex":9,"Labels":{"__name__":"memory","__period_type__":"space","__period_unit__":"bytes","__profile_type__":"memory:inuse_space:bytes:space:bytes","__service_name__":"pyroscope","__type__":"inuse_space","__unit__":"bytes","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":97,"SeriesIndex":10,"Labels":{"__name__":"mutex","__period_type__":"contentions","__period_unit__":"count","__profile_type__":"mutex:contentions:count:contentions:count","__service_name__":"pyroscope","__type__":"contentions","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":101,"SeriesIndex":11,"Labels":{"__name__":"mutex","__period_type__":"contentions","__period_unit__":"count","__profile_type__":"mutex:delay:nanoseconds:contentions:count","__service_name__":"pyroscope","__type__":"delay","__unit__":"nanoseconds","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":105,"SeriesIndex":12,"Labels":{"__name__":"process_cpu","__period_type__":"cpu","__period_unit__":"nanoseconds","__profile_type__":"process_cpu:cpu:nanoseconds:cpu:nanoseconds","__service_name__":"pyroscope","__type__":"cpu","__unit__":"nanoseconds","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}
{"SeriesRef":109,"SeriesIndex":13,"Labels":{"__name__":"process_cpu","__period_type__":"cpu","__period_unit__":"nanoseconds","__profile_type__":"process_cpu:samples:count:cpu:nanoseconds","__service_name__":"pyroscope","__type__":"samples","__unit__":"count","pyroscope_spy":"gospy","service_git_ref":"HEAD","service_name":"pyroscope","service_repository":"https://github.com/grafana/pyroscope","target":"all"}}

For completeness this is the order of the profiles table:

SeriesIndex: 0
SeriesIndex: 1
SeriesIndex: 2
SeriesIndex: 3
SeriesIndex: 3
SeriesIndex: 3
SeriesIndex: 4
SeriesIndex: 4
SeriesIndex: 4
SeriesIndex: 5
SeriesIndex: 5
SeriesIndex: 5
SeriesIndex: 6
SeriesIndex: 6
SeriesIndex: 6
SeriesIndex: 7
SeriesIndex: 7
SeriesIndex: 7
SeriesIndex: 8
SeriesIndex: 8
SeriesIndex: 8
SeriesIndex: 9
SeriesIndex: 9
SeriesIndex: 9
SeriesIndex: 10
SeriesIndex: 10
SeriesIndex: 10
SeriesIndex: 11
SeriesIndex: 11
SeriesIndex: 11
SeriesIndex: 12
SeriesIndex: 12
SeriesIndex: 12
SeriesIndex: 13
SeriesIndex: 13
SeriesIndex: 13

func (ls Labels) Delete(name string) Labels {
return slices.RemoveInPlace(ls, func(pair *typesv1.LabelPair, i int) bool {
return pair.Name == name
})
}

// Insert adds the given label to the set of labels.
// It assumes the labels are ordered
func (ls *Labels) Insert(name, value string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Insert comes with the risk of breaking the LabelsEnforcedOrder order, not a problem as long as we don't use user input with Insert.

@simonswine
Copy link
Contributor

Ok I have found the culprit, we probably need to loop throught he __order__ label to that point and remove it there:

--- a/pkg/phlaredb/tsdb/index.go
+++ b/pkg/phlaredb/tsdb/index.go
@@ -298,7 +298,7 @@ func (shard *indexShard) add(metric []*typesv1.LabelPair, fp model.Fingerprint)
                values.fps[fingerprints.value] = fingerprints
                internedLabels[i] = &typesv1.LabelPair{Name: values.name, Value: fingerprints.value}
        }
-       sort.Sort(internedLabels)
+       sort.Sort(phlaremodel.LabelsEnforcedOrder(internedLabels))
        return internedLabels
 }

I am also adding a test, to ensure the profile table is ordered as
expected.
@simonswine
Copy link
Contributor

simonswine commented Jun 7, 2024

Update: I have pushed a related test and fix ece0896

@kolesnikovae
Copy link
Collaborator Author

kolesnikovae commented Jun 8, 2024

I've also noticed that the fix does not have any effect at the last moment. The issue is that initially, I modified the default labels order (therefore this last sort had no effect, and the order on disk was enforced), but then I made changes for migration that broke it – I was planning to look into this today. Thank you @simonswine for fixing this!

@kolesnikovae kolesnikovae requested a review from a team as a code owner June 9, 2024 06:18
@kolesnikovae kolesnikovae merged commit fc023da into main Jun 10, 2024
16 checks passed
@kolesnikovae kolesnikovae deleted the feat/optimized-series-order branch June 10, 2024 08:41
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants