-
Notifications
You must be signed in to change notification settings - Fork 179
Vertical query merging and compaction #370
Vertical query merging and compaction #370
Conversation
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
chunks/chunks.go
Outdated
if len(chks) < 2 { | ||
return chks, nil | ||
} | ||
var newChks []Meta // Will contain the merged chunks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we allocate space for all of them?
newChks := make([]Meta, len(chks), len(chks))
. It will reduce allocations. And it's fine if we will not use all of this. Statistically we will use all of those (when they not overlap).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I actually observed significant increase in allocs when I tested with #240. I will test again with this change.
compact.go
Outdated
@@ -611,7 +651,8 @@ func (c *LeveledCompactor) populateBlock(blocks []BlockReader, meta *BlockMeta, | |||
chks[i].Chunk = newChunk | |||
} | |||
} | |||
if err := chunkw.WriteChunks(chks...); err != nil { | |||
var err error | |||
if chks, err = chunkw.WriteChunks(chks...); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not use if _ = :...; err != nil
idiom then if we don't want to scope down those return variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! It looks really really good, just some suggestions after initial review.
- Can we split those two things? vertical query and improved compaction?
- Make it more streamable (merging chunks)
- Do not change ChunkWriter API - no need to IMO
- Can we have the overlaps instrumented? So having logging and metrics if we spot and resolve overlap on compaction? This is extremely important, because it might happen that overlap is caused by wrong configuration / other bug in compaction (we had that before). As the result it is extremely hard to track down what happened. So compacting overlapped blocks should be clearly logged and reported (bassically the clear log line what was the SOURCE of blocks)
@bwplotka So are you suggesting to keep vertical query merge and compaction in separate PR? That is possible, but ideally you would like to vendor both of them together into prometheus. |
Sure, just those are quite separate problems, so it would be easier to control those separatedly |
On a second thought, we would expect vertical query merge when there are overlapping blocks. Hence we also need to take care of them during compaction, where vertical merging is required. So taking into account vertical blocks, I think we should not split this PR. PS: I will address the reviews soon. |
Yup, can agree with that. It is separate in code, but you cannot allow overlapping blocks until having those two things working - my bad |
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
…rge-and-compact Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
…rge-and-compact Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Actions for vertical compaction: * Sorting chunk metas * Calling chunks.MergeOverlappingChunks on the chunks Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* BenchmarkNormalCompaction => BenchmarkCompaction * Moved the benchmark from db_test.go to compact_test.go Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
…rge-and-compact Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
I ran some benchmarks from this: https://github.com/prometheus/tsdb/blob/7ae4941221123cd485a227fa3edc152974a00097/compact_test.go#L722-L743
As I fallback to normal compactions when blocks are not overlapping, similar benchmark results were expected. PS: Am I benchmarking it right? |
And this is for overlapping blocks, so, only vertical compaction.
|
…rge-and-compact Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
http://prombench.prometheus.io/grafana/d/7gmLoNDmz/prombench?orgId=1&var-RuleGroup=All&var-pr-number=4861 (prometheus/prometheus#4861) Benchmark looks fine for non-overlapping blocks, almost identical. So this PR is working fine with non-overlapping blocks 🎉 |
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Iterator bench: 20 blocks, 1000 series, 2000 samples per series per block. All samples non overlapping.
Seek bench: 20 blocks, 100 series, 200 samples per series per block. All samples non overlapping.
Benchmark for iterator doesn't look good. @gouthamve suggested adding new querier for overlapping blocks, and use it only when we detect overlapping blocks. This way non-overlapping blocks wont suffer from the performance loss. |
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
I think the last bit is to add a benchmark to compare vertical querier vs a normal one. |
I already have some benchmark https://github.com/prometheus/tsdb/blob/47436d02bf6badd45e80ad4346d46e9a3cbd8afe/compact_test.go#L724-L790 |
I meant for the querier not the compaction. |
Apparently, I have benchmarks for query iterator and seek, but only for non-overlapping blocks. I will modify it to also have overlapping blocks, and use Querier from DB instead. |
I think we need something like this: Create some non overlapping blocks Create identical blocks to the bench with the non overlap but this time add some overlap. The amount of overlap probably affects the results so choose some average and if needed will modify later with more bench tests. |
Also has as overlapping blocks. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
I have updated the benchmarks. |
Iterator (50% overlap got killed as entire benchmark ran too long):
Seek:
The reduction of time with overlap might also be a side affect of the reduced number of total |
@krasi-georgiev @gouthamve anything pending in this PR? |
An update about the benchmark: When creating blocks for the benchmark, |
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
New Benchmark ResultsIterator
|
…enchmarkQuerySeek. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
d985113
to
5f8d911
Compare
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
…rge-and-compact Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a couple of nits.
compact.go
Outdated
@@ -93,6 +95,10 @@ func newCompactorMetrics(r prometheus.Registerer) *compactorMetrics { | |||
Name: "prometheus_tsdb_compactions_failed_total", | |||
Help: "Total number of compactions that failed for the partition.", | |||
}) | |||
m.overlappingBlocks = prometheus.NewCounter(prometheus.CounterOpts{ | |||
Name: "prometheus_tsdb_compactions_overlapping_total", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compactions_overlapping_total
makes it feel like the number of concurrent compactions. Maybe compactions_overlapping_blocks_total
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compactions_overlapping_blocks_total
sounds like number of blocks that were overlapping during compaction. How about vertical_compactions_total
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already done :)
compact.go
Outdated
@@ -438,7 +508,8 @@ func (w *instrumentedChunkWriter) WriteChunks(chunks ...chunks.Meta) error { | |||
|
|||
// write creates a new block that is the union of the provided blocks into dir. | |||
// It cleans up all files of the old blocks after completing successfully. | |||
func (c *LeveledCompactor) write(dest string, meta *BlockMeta, blocks ...BlockReader) (err error) { | |||
// The returned bool 'overlapping' is true if the parent blocks were overlapping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no bool
returned.
compact.go
Outdated
@@ -541,7 +612,8 @@ func (c *LeveledCompactor) write(dest string, meta *BlockMeta, blocks ...BlockRe | |||
|
|||
// populateBlock fills the index and chunk writers with new data gathered as the union | |||
// of the provided blocks. It returns meta information for the new block. | |||
func (c *LeveledCompactor) populateBlock(blocks []BlockReader, meta *BlockMeta, indexw IndexWriter, chunkw ChunkWriter) error { | |||
// The returned bool is true if the parent blocks were overlapping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing is returned ;)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@gouthamve I hope all your comments are answered? |
* Vertical series iterator Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Select overlapped blocks first in compactor Plan() Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Added vertical compaction Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Code cleanup and comments Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix review comments Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix tests Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Add benchmark for compaction Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Perform vertical compaction only when blocks are overlapping. Actions for vertical compaction: * Sorting chunk metas * Calling chunks.MergeOverlappingChunks on the chunks Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Benchmark for vertical compaction * BenchmarkNormalCompaction => BenchmarkCompaction * Moved the benchmark from db_test.go to compact_test.go Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Benchmark for query iterator and seek for non overlapping blocks Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Vertical query merge only for overlapping blocks Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Simplify logging in Compact(...) Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Updated CHANGELOG.md Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Calculate overlapping inside populateBlock Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * MinTime and MaxTime for BlockReader. Using this to find overlapping blocks in populateBlock() Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Sort blocks w.r.t. MinTime in reload() Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Log about overlapping in LeveledCompactor.write() instead of returning bool Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Log about overlapping inside LeveledCompactor.populateBlock() Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix review comments Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Refactor createBlock to take optional []Series Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * review1 Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com> * Updated CHANGELOG and minor nits Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * nits Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Updated CHANGELOG Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Refactor iterator and seek benchmarks for Querier. Also has as overlapping blocks. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Additional test case Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * genSeries takes optional labels. Updated BenchmarkQueryIterator and BenchmarkQuerySeek. Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Split genSeries into genSeries and populateSeries Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Check error in benchmark Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Fix review comments Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Warn about overlapping blocks in reload() Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> Signed-off-by: naivewong <867245430@qq.com>
Fixes #90
Not tested in real world scenario yet.
Comments and suggestions are welcome.