[dbnode] Re-add aggregator doc limit update #3137
Conversation
src/dbnode/storage/index/block.go
Outdated
lastField []byte
lastFieldIsValid bool
reuseLastEntry bool
fieldsAdded, termsAdded bool
newFieldAdded / newTermAdded?
src/dbnode/storage/index/block.go
Outdated
@@ -234,7 +234,6 @@ func NewBlock(
		iopts,
	)

	aggAdded := opts.InstrumentOptions().MetricsScope().Counter("aggregate-added-counter")
why remove this?
This metric is essentially a copy of the total aggregate results metric here, so not too useful
src/dbnode/storage/index/block.go
Outdated
@@ -711,27 +715,58 @@ func (b *block) aggregateWithSpan(
	}

	field, term := iter.Current()
	batch, numAdded = b.appendFieldAndTermToBatch(batch, field, term, iterateTerms)
so the problem here was that numAdded included duplicates?
Added a writeup of the problem on the other comment; hopefully that explains what was wrong.
if results.EnforceLimits() {
	if err := b.docsLimit.Inc(len(batch), source); err != nil {
		return false, err
if lastField == nil {
so we're essentially only incrementing the docs limit if we're working on a new field?
So essentially what was happening previously was this:

- Incoming field/term tuples were added to the batch [here].
  - If terms were included in the query (e.g. label/foo/values) and the incoming field matched the previous field, it would just alloc and append the term onto the existing entry's terms. This was the underlying problem that took out our test cluster when cardinality exploded, as it would allocate a huge slice of tags, as well as the tags themselves, here. This is because the batch size is calculated as len(terms), which caused the batch max to apply only to new (different) label names.
  - Once either the batch was full (note that for a label/foo/values request this would never happen, since the batch length would be at most 1) or the incoming entries were exhausted, we'd increment the docs limit with len(terms).

Although there were no changes to how the docs limit was incremented, because the batch size calculation was changed from len(terms) to roughly len(terms) + len(all entries), we'd increment the docs limit far more often. That is a big problem when the terms are mostly for a single field, as is always the case with label/foo/values, which is probably the most widely used metadata query endpoint; in essence we'd be reporting a doc increase for every 256 terms, rather than once per query as previously.
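To make the accounting difference concrete, here is a minimal, purely illustrative Go sketch (none of these names are the real block.go identifiers, and the real code flushes batches rather than looping like this): it shows why counting a document only when a new field starts yields a single docs-limit increment for a label/foo/values-style query, while counting per appended term (or per 256-term batch flush) scales with term cardinality.

```go
// Illustrative sketch only; batchEntry and appendFieldAndTerm are hypothetical
// stand-ins for the batching behaviour described above.
package main

import "fmt"

type batchEntry struct {
	field string
	terms []string
}

// appendFieldAndTerm mimics the batching behaviour: a term for the same field
// as the previous tuple is appended to the existing entry rather than
// starting a new one. It reports whether a new field entry was created.
func appendFieldAndTerm(batch []batchEntry, field, term string) ([]batchEntry, bool) {
	if n := len(batch); n > 0 && batch[n-1].field == field {
		batch[n-1].terms = append(batch[n-1].terms, term)
		return batch, false
	}
	return append(batch, batchEntry{field: field, terms: []string{term}}), true
}

func main() {
	// A label/foo/values-style result: a single field with many terms.
	var batch []batchEntry
	perFieldCount := 0 // docs-limit usage counted once per new field
	perTermCount := 0  // docs-limit usage scaling with terms (roughly what per-batch flushing produced)

	for i := 0; i < 1000; i++ {
		var newField bool
		batch, newField = appendFieldAndTerm(batch, "foo", fmt.Sprintf("value-%d", i))
		if newField {
			perFieldCount++
		}
		perTermCount++
	}

	fmt.Println("per-field increments:", perFieldCount) // 1 for the whole query
	fmt.Println("per-term increments:", perTermCount)   // 1000, inflating the limit
}
```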
Codecov Report
@@ Coverage Diff @@
## master #3137 +/- ##
==========================================
+ Coverage 65.7% 72.3% +6.5%
==========================================
Files 234 1087 +853
Lines 24512 100786 +76274
==========================================
+ Hits 16127 72893 +56766
- Misses 7333 22829 +15496
- Partials 1052 5064 +4012
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
if ns == nil {
	return &usageMetrics{}
nit: Shouldn't this just return an error and the caller should return an error if construction fails?
For tests we should probably just update callsites to always pass a non-nil ident.ID for the namespace.
Yeah, you're probably right; this felt really wonky.
Will remove this; to be honest, we don't really get a lot out of per-namespace metrics here, so I'll just drop namespace from the tags entirely.
Resolving in a following PR.
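For illustration only, a rough sketch of the error-returning constructor shape described in the nit above (the package placement, the docsMatched counter, and the metric name are assumptions, not the real code; as noted in the reply, the actual resolution went a different way and dropped the namespace tag entirely):

```go
// Sketch: package placement is assumed, not the actual location of usageMetrics.
package index

import (
	"errors"

	"github.com/m3db/m3/src/x/ident"
	"github.com/m3db/m3/src/x/instrument"
	"github.com/uber-go/tally"
)

// usageMetrics is a stand-in for the struct in the diff; the single counter
// field here is illustrative.
type usageMetrics struct {
	docsMatched tally.Counter
}

// newUsageMetrics sketches the suggested constructor: fail construction on a
// nil namespace ID instead of silently returning an empty usageMetrics.
func newUsageMetrics(ns ident.ID, iopts instrument.Options) (*usageMetrics, error) {
	if ns == nil {
		return nil, errors.New("cannot create usage metrics: nil namespace ID")
	}
	scope := iopts.MetricsScope().Tagged(map[string]string{"namespace": ns.String()})
	return &usageMetrics{
		docsMatched: scope.Counter("docs-matched"), // illustrative metric name
	}, nil
}
```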
if err := r.addFieldWithLock(field.Name, field.Value); err != nil {
	return fmt.Errorf("unable to add document [%+v]: %w", document, err)
// NB: cannot insert more than max docs, so that acts as the upper bound here.
remainingInserts := remainingDocs
Could this not just be:

```go
remainingInserts := remainingDocs
if r.aggregateOpts.SizeLimit != 0 {
	remainingInserts = int(math.Min(float64(remainingInserts), float64(r.aggregateOpts.SizeLimit-r.size)))
}
```
LGTM
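As an aside, an integer-only variant of that clamp avoids the float64 round-trip; Go had no builtin integer min at the time, so the sketch below assumes a small local helper (minInt is hypothetical, not an existing function in the codebase), reusing the fields from the suggestion above.

```go
// minInt is a hypothetical local helper; Go's builtin min did not exist yet.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// NB: cannot insert more than max docs, so that acts as the upper bound here.
remainingInserts := remainingDocs
if r.aggregateOpts.SizeLimit != 0 {
	remainingInserts = minInt(remainingInserts, r.aggregateOpts.SizeLimit-r.size)
}
```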
* master: (30 commits)
  - [dbnode] Use go context to cancel index query workers after timeout (#3194)
  - [aggregator] Fix change ActivePlacement semantics on close (#3201)
  - [aggregator] Simplify (Active)StagedPlacement API (#3199)
  - [aggregator] Checking if metadata is set to default should not cause copying (#3198)
  - [dbnode] Remove readers and writer from aggregator API (#3122)
  - [aggregator] Avoid large copies in entry rollup comparisons by making them more inline-friendly (#3195)
  - [dbnode] Re-add aggregator doc limit update (#3137)
  - [m3db] Do not close reader in filterFieldsIterator.Close() (#3196)
  - Revert "Remove disk series read limit (#3174)" (#3193)
  - [instrument] Improve sampled timer and stopwatch performance (#3191)
  - Omit unset fields in metadata json (#3189)
  - [dbnode] Remove left-over code in storage/bootstrap/bootstrapper (#3190)
  - [dbnode][coordinator] Support match[] in label endpoints (#3180)
  - Instrument the worker pool with the wait time (#3188)
  - Instrument query path (#3182)
  - [aggregator] Remove indirection, large copy from unaggregated protobuf decoder (#3186)
  - [aggregator] Sample timers completely (#3184)
  - [aggregator] Reduce error handling overhead in rawtcp server (#3183)
  - [aggregator] Move shardID calculation out of critical section (#3179)
  - Move instrumentation cleanup to FetchTaggedResultIterator Close() (#3173)
  - ...
Reverts #3133 and uses the regression test suite from #3135 to apply a backwards-compatible doc limit while maintaining the per-query controls that had been reverted.
Review guide: commit 77100c4 is a revert, so it may be easier to look at the diff from then on.