Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Parquet block format #1479

Merged
merged 75 commits into from
Jun 23, 2022
Merged

Add Parquet block format #1479

merged 75 commits into from
Jun 23, 2022

Conversation

mdisibio
Copy link
Contributor

@mdisibio mdisibio commented Jun 8, 2022

What this PR does:
This PR adds a new experimental Parquet block format. Things are stable and working well, but there is more work needed before it's able to run at large scale and considered production ready.

It can be enabled by setting the new config option:

storage:
  trace:
    block:
      version: vParquet

This causes the ingesters to start flushing all new blocks in parquet format. The backend will be slowly converted over the retention period as existing v2 blocks are deleted. The compactor does not translate between encodings.

Which issue(s) this PR fixes:
Fixes tempo's lack of columns
See also: #1407

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

mdisibio and others added 30 commits May 16, 2022 15:52
starting parquet support. some things are still todo:  compaction, find trace by ID

Co-Authored-By: Annanay Agarwal <annanayagarwal@gmail.com>
Co-Authored-By: Martin Disibio <mdisibio@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
…ffer in findtracebyid

Signed-off-by: Annanay <annanayagarwal@gmail.com>
…ndTraceByID

Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>

Make compactor block complete async

Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
@joe-elliott joe-elliott marked this pull request as ready for review June 17, 2022 13:25
@mdisibio mdisibio added this to the v1.5 milestone Jun 17, 2022
@mdisibio mdisibio changed the title WIP: Add Parquet block format Add Parquet block format Jun 17, 2022
@altanozlu
Copy link
Contributor

Hi i was trying this pr and i got this crash:

level=info ts=2022-06-18T12:50:35.687296444Z caller=instance.go:46 tenant=single-tenant msg="creating WAL" dir=tempo-data/mg/wal/single-tenant
panic: duplicate metrics collector registration attempted

goroutine 2963 [running]:
github.com/prometheus/client_golang/prometheus.(*wrappingRegisterer).MustRegister(0xc0017d50b0, {0xc000a7fa60?, 0x1, 0x1f603fa?})
        /root/tempo/repo/tempo/vendor/github.com/prometheus/client_golang/prometheus/wrap.go:104 +0x151
github.com/prometheus/prometheus/tsdb/wal.NewWatcherMetrics({0x2435d40, 0xc0017d50b0})
        /root/tempo/repo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wal/watcher.go:139 +0x374
github.com/prometheus/prometheus/storage/remote.NewWriteStorage({0x24268c0, 0xc000f2b800}, {0x2435d40?, 0xc0017d50b0}, {0xc000059880, 0x1f}, 0xdf8475800, {0x24258c0?, 0x34a4c58})
        /root/tempo/repo/tempo/vendor/github.com/prometheus/prometheus/storage/remote/write.go:77 +0xa5
github.com/prometheus/prometheus/storage/remote.NewStorage({0x24252e0?, 0xc002e7d9a0?}, {0x2435d40, 0xc0017d50b0}, 0x2018b90, {0xc000059880, 0x1f}, 0x1b8df00?, {0x24258c0, 0x34a4c58})
        /root/tempo/repo/tempo/vendor/github.com/prometheus/prometheus/storage/remote/storage.go:75 +0x116
github.com/grafana/tempo/modules/generator/storage.New(0xc0001979a8, {0x1f64ede, 0xd}, {0x2435d10?, 0xc00013ea00}, {0x24252e0, 0xc000248370})
        /root/tempo/repo/tempo/modules/generator/storage/instance.go:60 +0x4b8
github.com/grafana/tempo/modules/generator.(*Generator).createInstance(0xc000a91290, {0x1f64ede, 0xd})
        /root/tempo/repo/tempo/modules/generator/generator.go:231 +0x4f
github.com/grafana/tempo/modules/generator.(*Generator).getOrCreateInstance(0xc000a91290, {0x1f64ede, 0xd})
        /root/tempo/repo/tempo/modules/generator/generator.go:214 +0xe5
github.com/grafana/tempo/modules/generator.(*Generator).PushSpans(0xc000a91290, {0x243ba70?, 0xc0017d5020?}, 0x1b90b00?)
        /root/tempo/repo/tempo/modules/generator/generator.go:190 +0x1b2
github.com/grafana/tempo/pkg/tempopb._MetricsGenerator_PushSpans_Handler.func1({0x243ba70, 0xc0017d5020}, {0x1e7ad80?, 0xc00000dc98})
        /root/tempo/repo/tempo/pkg/tempopb/tempo.pb.go:1272 +0x78
github.com/grafana/tempo/cmd/tempo/app.glob..func2({0x243ba70, 0xc0017d4ff0}, {0x1e7ad80, 0xc00000dc98}, 0xc001c155a0?, 0xc000c0e3a8)
        /root/tempo/repo/tempo/cmd/tempo/app/fake_auth.go:22 +0x83
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x243ba70?, 0xc0017d4ff0?}, {0x1e7ad80?, 0xc00000dc98?})
        /root/tempo/repo/tempo/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
github.com/weaveworks/common/middleware.UnaryServerInstrumentInterceptor.func1({0x243ba70, 0xc0017d4ff0}, {0x1e7ad80, 0xc00000dc98}, 0xc000893620, 0xc000893640)
        /root/tempo/repo/tempo/vendor/github.com/weaveworks/common/middleware/grpc_instrumentation.go:33 +0xa2
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x243ba70?, 0xc0017d4ff0?}, {0x1e7ad80?, 0xc00000dc98?})
        /root/tempo/repo/tempo/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
github.com/opentracing-contrib/go-grpc.OpenTracingServerInterceptor.func1({0x243ba70, 0xc0017d4f60}, {0x1e7ad80, 0xc00000dc98}, 0xc000893620, 0xc000893660)
        /root/tempo/repo/tempo/vendor/github.com/opentracing-contrib/go-grpc/server.go:57 +0x402
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x243ba70?, 0xc0017d4f60?}, {0x1e7ad80?, 0xc00000dc98?})
        /root/tempo/repo/tempo/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
github.com/weaveworks/common/middleware.GRPCServerLog.UnaryServerInterceptor({{0x244c250?, 0xc0004a3970?}, 0x20?}, {0x243ba70, 0xc0017d4f60}, {0x1e7ad80, 0xc00000dc98}, 0xc000893620, 0xc000893680)
        /root/tempo/repo/tempo/vendor/github.com/weaveworks/common/middleware/grpc_logging.go:29 +0xbe
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x243ba70?, 0xc0017d4f60?}, {0x1e7ad80?, 0xc00000dc98?})
        /root/tempo/repo/tempo/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1({0x243ba70, 0xc0017d4f60}, {0x1e7ad80, 0xc00000dc98}, 0xc000830ae0?, 0x1c4d880?)
        /root/tempo/repo/tempo/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:34 +0xbf
github.com/grafana/tempo/pkg/tempopb._MetricsGenerator_PushSpans_Handler({0x1ecb0c0?, 0xc000a91290}, {0x243ba70, 0xc0017d4f60}, 0xc000a819e0, 0xc0007bd290)
        /root/tempo/repo/tempo/pkg/tempopb/tempo.pb.go:1274 +0x138
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00059e380, {0x244ab60, 0xc0016bf040}, 0xc001217b00, 0xc000b0d200, 0x343eb50, 0x0)
        /root/tempo/repo/tempo/vendor/google.golang.org/grpc/server.go:1282 +0xccf
google.golang.org/grpc.(*Server).handleStream(0xc00059e380, {0x244ab60, 0xc0016bf040}, 0xc001217b00, 0x0)
        /root/tempo/repo/tempo/vendor/google.golang.org/grpc/server.go:1616 +0xa1b
google.golang.org/grpc.(*Server).serveStreams.func1.2()
        /root/tempo/repo/tempo/vendor/google.golang.org/grpc/server.go:921 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
        /root/tempo/repo/tempo/vendor/google.golang.org/grpc/server.go:919 +0x28a

metrics_generator was enabled and when i've removed data folder, it started to working.

@mdisibio
Copy link
Contributor Author

Hi i was trying this pr and i got this crash:

Hi thanks for posting this. I have also seen this error occasionally, but on the surface it doesn't appear related to parquet. It looks related to the metrics generator. @kvrhdn Any thoughts?

@yvrhdn
Copy link
Member

yvrhdn commented Jun 20, 2022

Hi i was trying this pr and i got this crash:

Hi thanks for posting this. I have also seen this error occasionally, but on the surface it doesn't appear related to parquet. It looks related to the metrics generator. @kvrhdn Any thoughts?

Yeah, I think you are hitting this issue: #1449. This is an issue in the metrics-generator and is unrelated to Parquet. It happens when creating the tenant-specific bits in the metrics-generator fails and we retry to create it the next request.
Figuring out what the issue is (maybe bad config?) and restarting the process should do the trick.

annanay25 and others added 2 commits June 21, 2022 18:28
mdisibio and others added 3 commits June 21, 2022 12:28
@annanay25 annanay25 merged commit 08a4995 into main Jun 23, 2022
@annanay25 annanay25 deleted the parquet branch June 23, 2022 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants