Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: sanatize structured metadata at query time #13983

Merged
merged 7 commits into from
Aug 28, 2024

Conversation

MasslessParticle
Copy link
Contributor

There's a bug in structured metadata where Loki can accept characters that are invalid in prometheus label names. A subsequent PR will reject inputs but the data that's already been ingested is not queryable.

As a workaround, this PR sanatizes structured metadata label names at query time. In the case where no bad inputs exist, there isn't a meaningful performance difference. Otherwise, there is 1 alloc per bad input. We could do this with byte slices to avoid incurring allocs, but doing it this way is the most straightforward way to ensure we aren't missing multi-byte characters that may exist in label names.

benchmark before:

goos: darwin
goarch: arm64
pkg: github.com/grafana/loki/v3/pkg/logql/log
Benchmark_Pipeline/pipeline_bytes-12              270484              4416 ns/op            1795 B/op         34 allocs/op
Benchmark_Pipeline/pipeline_string-12             266888              4432 ns/op            1859 B/op         35 allocs/op
Benchmark_Pipeline/line_extractor_bytes-12                                        241034              4940 ns/op            1489 B/op         33 allocs/op
Benchmark_Pipeline/line_extractor_string-12                                       243536              4933 ns/op            1489 B/op         33 allocs/op
Benchmark_Pipeline/label_extractor_bytes-12                                       236311              5224 ns/op            1489 B/op         33 allocs/op
Benchmark_Pipeline/label_extractor_string-12                                      239662              5015 ns/op            1489 B/op         33 allocs/op

benchmark after

goos: darwin
goarch: arm64
pkg: github.com/grafana/loki/v3/pkg/logql/log
Benchmark_Pipeline/pipeline_bytes-12              244771              5156 ns/op            1795 B/op         34 allocs/op
Benchmark_Pipeline/pipeline_string-12             244720              4869 ns/op            1859 B/op         35 allocs/op
Benchmark_Pipeline/pipeline_bytes_no_invalid_structured_metadata-12               238662              5326 ns/op            1555 B/op         35 allocs/op
Benchmark_Pipeline/pipeline_string_with_invalid_structured_metadata-12            228170              5219 ns/op            1664 B/op         37 allocs/op
Benchmark_Pipeline/line_extractor_bytes-12                                        206654              5378 ns/op            1490 B/op         33 allocs/op
Benchmark_Pipeline/line_extractor_string-12                                       222890              5429 ns/op            1490 B/op         33 allocs/op
Benchmark_Pipeline/label_extractor_bytes-12                                       218815              5601 ns/op            1490 B/op         33 allocs/op
Benchmark_Pipeline/label_extractor_string-12                                      208882              5470 ns/op            1490 B/op         33 allocs/op

@MasslessParticle MasslessParticle requested a review from a team as a code owner August 27, 2024 22:28
@rfratto
Copy link
Member

rfratto commented Aug 27, 2024

Will this break anything for keys coming from OTel attributes, since (IIRC) periods are common there?

@MasslessParticle
Copy link
Contributor Author

MasslessParticle commented Aug 27, 2024

@rfratto yes and no. Yes because Loki will happily ingest periods so users will be querying for something unexpected. No because this data is unqueryable right now because the invalid labels will break here.

Copy link
Member

@rfratto rfratto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.

I have some concerns about the behaviour at query time if two different keys sanitize to the same value (i.e., hello.world and hello!world both sanitize to hello_world), but I don't know enough about the LogQL engine yet to know if this would be a real issue.


func replaceChars(str string, offsets []int) string {
offsets = offsets[:0]
for i, r := range str {
Copy link
Contributor

@ashwanthgoli ashwanthgoli Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at prometheus NormalizeLabel, we also need to handle labels starting with a number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great find. I didn't realize NormalizeLabel was a thing!

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should validate structured metadata in the future at ingest time. I would vote instead to disable validation for returned labels and structured metadata. This seems trickier to implement but more future-proof.

Only labels should be validate at ingest time.

EDIT: we should avoid query-time validation/normalization.

@MasslessParticle MasslessParticle merged commit 3bf7fa9 into main Aug 28, 2024
62 checks passed
@MasslessParticle MasslessParticle deleted the tpatterson/structured-metadata-fix branch August 28, 2024 19:11
@MasslessParticle MasslessParticle added type/bug Somehing is not working as expected backport k217 labels Aug 28, 2024
grafanabot pushed a commit that referenced this pull request Aug 28, 2024
pascal-sochacki pushed a commit to pascal-sochacki/loki that referenced this pull request Aug 29, 2024
pascal-sochacki pushed a commit to pascal-sochacki/loki that referenced this pull request Aug 29, 2024
pascal-sochacki pushed a commit to pascal-sochacki/loki that referenced this pull request Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport k217 size/M type/bug Somehing is not working as expected
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants