Store: Make initial sync more robust
Added a retry mechanism for the store's initial sync: if the initial sync fails, it is retried every 5 seconds for up to 15 seconds (3 retries in total).

Signed-off-by: Kartik-Garg <kartik.garg@infracloud.io>

Merge release 0.30 into main (thanos-io#6041)

* compact: remove cancel on SyncMetas errors (thanos-io#5923)

In favour of 86b4039, SyncMetas will retry if the error is retriable.
Also, the cleanPartialMarked calls that are wrapped in runutil.Repeat() will be repeated;
those that are not, and whose errors are not retriable, will interrupt run.Group() by returning err,
and Group will call cancel(), since that is configured as its interrupt func.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

* Cut v0.30.0-rc.0 (thanos-io#5992)

* Cut v0.30.0-rc.0

Signed-off-by: bwplotka <bwplotka@gmail.com>

* mdox fix.

Signed-off-by: bwplotka <bwplotka@gmail.com>

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Cut 0.30.0 (thanos-io#6011)

Signed-off-by: bwplotka <bwplotka@gmail.com>

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* *: cut 0.30.1 (thanos-io#6017)

* fix duplicate metrics registration in redis client (thanos-io#6009)

* fix duplicate metrics registration in redis client

Signed-off-by: Kama Huang <kamatogo13@gmail.com>

* fixed test

Signed-off-by: Kama Huang <kamatogo13@gmail.com>

Signed-off-by: Kama Huang <kamatogo13@gmail.com>

* *: cut 0.30.1

Add CHANGELOG entry.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

Signed-off-by: Kama Huang <kamatogo13@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: Kama Huang <121007071+kama910@users.noreply.github.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Tracing: Fix sampler defaults (thanos-io#5887)

* Fix sampler defaults

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Add CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Replace checkout with git-shallow-clone (thanos-io#5829)

Signed-off-by: Matej Gera <matejgera@gmail.com>

Signed-off-by: Matej Gera <matejgera@gmail.com>

Signed-off-by: Matej Gera <matejgera@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* CHANGELOG: fix

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Kama Huang <kamatogo13@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>
Co-authored-by: Seena Fallah <seenafallah@gmail.com>
Co-authored-by: Kama Huang <121007071+kama910@users.noreply.github.com>
3 people committed Jan 18, 2023
2 parents 853b960 + e85223b commit 0b5a1d4
Showing 1 changed file with 5 additions and 15 deletions.
20 changes: 5 additions & 15 deletions cmd/thanos/store.go
@@ -49,10 +49,10 @@ import (
 	"github.com/thanos-io/thanos/pkg/ui"
 )
 
-// const (
-// timeoutDuration = 15
-// intervalDuration = 5
-// )
+const (
+	timeoutDuration = 15
+	intervalDuration = 5
+)
 
 type storeConfig struct {
 	indexCacheConfigs extflag.PathOrContent
@@ -386,17 +386,7 @@ func runStore(
 
 	level.Info(logger).Log("msg", "initializing bucket store")
 	begin := time.Now()
-
-	//This will stop retrying after 15 seconds.
-	initialSyncCtx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
-	defer cancel()
-
-	//If error occurs, it will re-try after every 5 seconds, but only for 15 seconds, (so total re-try is three).
-	err := runutil.Retry(5*time.Second, initialSyncCtx.Done(), func() error {
-		return bs.InitialSync(ctx)
-	})
-
-	if err != nil {
+	if err := bs.InitialSync(ctx); err != nil {
 		close(bucketStoreReady)
 		return errors.Wrap(err, "bucket store initial sync")
 	}