jobs: improve the status reporting of index backfill jobs #39773

spaskob · 2019-08-20T17:41:38Z

Before this change, index backfill workers wrote their updated spans
concurrently to the job's row, which could lead to blow-up in the jobs
table. This PR is a first step toward improving this. Instead of
updating the finished spans in each worker, we propagate this
information to the gateway node and do bulk updates there. The next
step is to let the workers run longer by no longer canceling them after
2 minutes, which is currently done for historical reasons.

Fixes #36601.

Release note: None


dt · 2019-08-23T11:07:27Z

pkg/sql/backfill.go

@@ -843,10 +841,16 @@ func (sc *SchemaChanger) distBackfill(
 						otherTableDescs = append(otherTableDescs, *table.TableDesc())
 					}
 				}
-				rw := &errOnlyResultWriter{}
+				metaFn := func(_ context.Context, meta *distsqlpb.ProducerMetadata) {
+					if meta.BulkProcessorProgress != nil {


I think we avoid wrapping todoSpans access in a mutex because all the access from the callback is from the same goroutine (the distsql inbound goroutine) and all happens while this goroutine is blocked on Run.

If that is correct, that's a slightly brittle assumption but it works for me for now.

spaskob · 2019-08-23T14:05:18Z

bors r+

39773: jobs: improve the status reporting of index backfill jobs r=spaskob a=spaskob

Co-authored-by: Spas Bojanov <spas@cockroachlabs.com>

craig · 2019-08-23T14:29:09Z

Build succeeded


spaskob requested a review from dt August 20, 2019 17:41

spaskob force-pushed the index-backfill-progress branch from 8f4be12 to 9bce82e Compare August 22, 2019 20:14

dt approved these changes Aug 23, 2019


dt reviewed Aug 23, 2019


craig bot merged commit 9bce82e into cockroachdb:master Aug 23, 2019

knz mentioned this pull request Nov 10, 2019

User-facing changes in 19.2 that were not picked up in release notes cockroachdb/docs#5819

Closed
