[dbnode] Add config option to set repair strategy (default/full_sweep) and concurrency #3573

Merged
robskillington merged 14 commits into master from r/repair-shard-concurrency-config
Jul 1, 2021


robskillington
Collaborator

What this PR does / why we need it:

Adds the ability to set the repair strategy to full_sweep, which sweeps from the beginning of time to the end without restarting when it encounters blocks that need repair.

This mode may be better suited to clusters that have never had repair enabled: it ensures that historical data is repaired at least once in a full sweep before the cluster switches back to the default strategy.
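To make the difference between the two strategies concrete, here is a minimal sketch of how a single pass might choose blocks. It is not the code from this PR: the `Strategy` type, the `block` struct and `blocksForPass` are hypothetical names, and it assumes the default strategy stops at the first block that needs repair and lets the next pass start over.

```go
package main

// Strategy is a hypothetical stand-in for the repair strategy setting
// described in this PR; it is not the type used in the M3 codebase.
type Strategy int

const (
	// DefaultStrategy is assumed to restart the sweep after it repairs
	// a block that needed repair.
	DefaultStrategy Strategy = iota
	// FullSweepStrategy keeps going through the whole range in one pass.
	FullSweepStrategy
)

// block is a hypothetical summary of one block in the repair range.
type block struct {
	start       int64 // block start in Unix seconds
	needsRepair bool
}

// blocksForPass returns the block starts that one pass would repair.
// blocks must be given in the order the sweep visits them.
func blocksForPass(blocks []block, s Strategy) []int64 {
	var out []int64
	for _, b := range blocks {
		if !b.needsRepair {
			continue
		}
		out = append(out, b.start)
		if s == DefaultStrategy {
			// Assumed default behaviour: handle this block, then stop
			// so that the next pass starts again from the beginning.
			break
		}
	}
	return out
}
```

With three blocks of which the first and last need repair, the default strategy in this sketch returns only the first block for the pass, while the full sweep returns both in visiting order.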

It also adds the ability to set the repair concurrency from config.
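As an illustration of what a configurable repair concurrency could control, the sketch below bounds how many repairs run at once with a buffered channel used as a semaphore. It assumes the setting limits per-shard parallelism (suggested by the branch name, not confirmed in the description); `repairShards` and `repairFn` are hypothetical names, and the fallback to 1 is an assumption rather than the PR's behaviour.

```go
package main

import "sync"

// repairShards calls repairFn once per shard, with at most `concurrency`
// calls running at the same time. Hypothetical helper, not the PR's code.
func repairShards(shards []uint32, concurrency int, repairFn func(shard uint32)) {
	if concurrency < 1 {
		concurrency = 1 // assumed fallback for an unset or invalid value
	}

	var wg sync.WaitGroup
	sem := make(chan struct{}, concurrency) // one slot per allowed worker

	for _, shard := range shards {
		shard := shard // capture the loop variable for the goroutine
		wg.Add(1)
		sem <- struct{}{} // blocks while all slots are taken
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done
			repairFn(shard)
		}()
	}

	wg.Wait() // return only after every shard has been processed
}
```

A call such as `repairShards(shards, 2, fn)` would run `fn` for every shard while never having more than two invocations in flight.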

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

@robskillington robskillington enabled auto-merge (squash) June 25, 2021 21:49

codecov bot commented Jun 25, 2021

Codecov Report

Merging #3573 (1c1a7ad) into master (1c1a7ad) will not change coverage.
The diff coverage is n/a.

❗ Current head 1c1a7ad differs from pull request most recent head 1f1e8ef. Consider uploading reports for the commit 1f1e8ef to get more accurate results


@@          Coverage Diff           @@
##           master   #3573   +/-   ##
======================================
  Coverage    55.8%   55.8%           
======================================
  Files         550     550           
  Lines       62174   62174           
======================================
  Hits        34699   34699           
  Misses      24379   24379           
  Partials     3096    3096           
Flag         Coverage Δ
aggregator   57.3% <0.0%> (ø)
cluster      ∅ <0.0%> (∅)
collector    58.4% <0.0%> (ø)
dbnode       60.1% <0.0%> (ø)
m3em         46.4% <0.0%> (ø)
metrics      19.7% <0.0%> (ø)
msg          74.3% <0.0%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -681,6 +695,13 @@ func (r *dbRepairer) Repair() error {
 		leastRecentlyRepairedBlockStartLastRepairTime xtime.UnixNano
 	)
 	repairRange.IterateBackward(blockSize, func(blockStart xtime.UnixNano) bool {
 		// Update metrics around progress of repair.
 		blockStartUnixSeconds := blockStart.ToTime().Unix()
nit: blockStart.Seconds()

@arnikola arnikola left a comment


LGTM

@robskillington robskillington merged commit 89868e6 into master Jul 1, 2021
@robskillington robskillington deleted the r/repair-shard-concurrency-config branch July 1, 2021 21:11