Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jobs: improve the status reporting of index backfill jobs #39773

Merged
merged 1 commit into from
Aug 23, 2019

Conversation

spaskob
Copy link
Contributor

@spaskob spaskob commented Aug 20, 2019

Before this change index backfill workers would write their updated
spans concurrently to the job's row which can potentially lead to blow
up in the jobs table. This PR is a first step in improving this. Instead
of updating the finished spans in each worker, we propagate this info to
the gateway node and do bulk updates there. Next step is to let the
workers run longer by not canceling them at 2 mins which is currently
done for historical reasons.

Fixes #36601.

Release note: None

@spaskob spaskob requested a review from dt August 20, 2019 17:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Before this change index backfill workers would write their updated
spans concurrently to the job's row which can potentially lead to blow
up in the jobs table. This PR is a first step in improving this. Instead
of updating the finished spans in each worker, we propagate this info to
the gateway node and do bulk updates there. Next step is to let the
workers run longer by not canceling them at 2 mins which is currently
done for historical reasons.

Fixes cockroachdb#36601.

Release note: None
@@ -843,10 +841,16 @@ func (sc *SchemaChanger) distBackfill(
otherTableDescs = append(otherTableDescs, *table.TableDesc())
}
}
rw := &errOnlyResultWriter{}
metaFn := func(_ context.Context, meta *distsqlpb.ProducerMetadata) {
if meta.BulkProcessorProgress != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we avoid wrapping todoSpans access in a mutex because all the access from the callback is from the same goroutine (the distsql inbound goroutine) and all happens while this goroutine is blocked on Run.

If that is correct, that's a slightly brittle assumption but it works for me for now.

@spaskob
Copy link
Contributor Author

spaskob commented Aug 23, 2019

bors r+

craig bot pushed a commit that referenced this pull request Aug 23, 2019
39773: jobs: improve the status reporting of index backfill jobs r=spaskob a=spaskob

Before this change index backfill workers would write their updated
spans concurrently to the job's row which can potentially lead to blow
up in the jobs table. This PR is a first step in improving this. Instead
of updating the finished spans in each worker, we propagate this info to
the gateway node and do bulk updates there. Next step is to let the
workers run longer by not canceling them at 2 mins which is currently
done for historical reasons.

Fixes #36601.

Release note: None

Co-authored-by: Spas Bojanov <spas@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Aug 23, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

backfiller: Index/Column backfiller ResumeSpans need to be reworked
3 participants