GC task too eagerly closes chunks #844

woodsaj · 2018-02-07T00:53:17Z

With the default config, the GC task closes a chunk once it has received no data for between 1 and 2 hours (chunk-max-stale + gc-interval).

However, if a chunk has a chunkspan of, say, 6 hours and you send 1 hour of data, then send nothing for 3 hours, the data will be rejected when you start sending again, because the chunk will already have been closed and flushed to Cassandra.

We need to make sure that chunks are not closed until after the chunk window has passed.

replay · 2018-02-07T04:11:38Z

Actually, this makes me think it would make sense to be able to configure chunk-max-stale per retention and not just globally. But as a quick fix that would be too much effort, so it's better to first just check whether the chunk window has passed.

woodsaj · 2018-02-07T08:30:30Z

I don't think so. We need chunks to be persisted before the datapoints become older than the Kafka retention. So if the Kafka retention is 7.5 hours, we just need to make sure that the chunk is persisted within 7.5 hours of the first datapoint being received.

So rather than having a chunk-max-stale setting, I think we should just have a max-chunk-age setting, where the maximum age allowed is kafka-retention - gc-interval, to ensure that we never have unflushed data that is older than kafka-retention.

Looks like this issue is a dup of #614

woodsaj · 2018-03-14T13:25:23Z

Re-opening this. #614 covers a much wider scope of problems, but it doesn't look like they are going to be fixed anytime soon. This specific issue is customer-impacting and needs to be fixed ASAP.

Dieterbe · 2018-03-14T13:34:24Z

It sounds reasonable and correct to only close chunks via GC after max-stale AND after the chunk end has passed (e.g. when wall clock > last ts of the chunk).
This may still not solve the issue if the data is sent with a big lag, but that sounds uncommon.
For realtime or semi-realtime data, this should work well. Maybe we should add another 5-minute offset or so to accommodate a slight delay.

@woodsaj if this sounds good to you i'll make the PR
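
A minimal sketch of the proposed rule, not the code from the eventual PR: `shouldClose` and `chunkEnd` are hypothetical names, timestamps are plain Unix seconds, chunks are assumed to be aligned to multiples of the chunkspan, and "chunk end" is read here as the end of the span that contains the last point received.

```go
package main

import "fmt"

// chunkEnd returns the first timestamp (in seconds) that falls outside the
// chunk containing ts, assuming chunks are aligned to multiples of span.
func chunkEnd(ts, span int64) int64 {
	return ts - ts%span + span
}

// shouldClose reports whether GC may close a chunk: it must have been idle for
// at least maxStale AND the wall clock must be past the chunk end plus a small
// grace offset for slightly delayed data. All arguments are in seconds.
func shouldClose(now, lastWrite, lastPointTs, span, maxStale, grace int64) bool {
	stale := now-lastWrite >= maxStale
	windowPassed := now > chunkEnd(lastPointTs, span)+grace
	return stale && windowPassed
}

func main() {
	const (
		hour     = int64(3600)
		span     = 6 * hour          // chunkspan from the report above
		maxStale = 1 * hour          // assumed chunk-max-stale
		grace    = int64(300)        // the suggested 5-minute offset
		base     = int64(1517961600) // 2018-02-07T00:00:00Z, a multiple of span
	)

	// One hour of data was received, then nothing.
	lastPointTs := base + hour
	lastWrite := base + hour

	// Three hours later the chunk is stale, but its window is still open.
	fmt.Println(shouldClose(base+4*hour, lastWrite, lastPointTs, span, maxStale, grace)) // false

	// Ten minutes after the chunk end, both conditions hold.
	fmt.Println(shouldClose(base+6*hour+600, lastWrite, lastPointTs, span, maxStale, grace)) // true
}
```

Under the stale-only check, the first call would have closed the chunk and the data arriving after the 3-hour gap would have been rejected.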

woodsaj · 2018-03-14T15:02:31Z

@Dieterbe yep, let's make that change until #614 is implemented.

fix: GC task too eagerly closes chunks. #844

woodsaj closed this as completed Feb 7, 2018

woodsaj reopened this Mar 14, 2018

woodsaj assigned Dieterbe Mar 14, 2018

woodsaj added bug customer-impacting labels Mar 14, 2018

Dieterbe closed this as completed in fd5a932 Mar 15, 2018

Dieterbe added a commit that referenced this issue Mar 15, 2018

Merge pull request #869 from grafana/issue-844

fb2a833

fix: GC task too eagerly closes chunks. #844

Dieterbe added this to the 0.8.2 milestone Dec 14, 2018
