Skip to content

Commit

Permalink
sql: don't distribute migration queries
Browse files Browse the repository at this point in the history
This patch inhibits DistSQL distribution for the queries that the
migrations run. This was prompted by #44101, which is causing a
distributed query done soon after a node startup to sometimes fail.

I've considered more bluntly disabling distribution for any query for a
short period of time after the node starts up, but I went with the more
targeted change to migrations because I think it's a bad idea for
migrations to use query distribution even outside of #44101 -
distributed queries are more fragile than local execution in general
(for example, because of DistSender retries). And migrations can't
tolerate any flakiness.

Fixes #43957
Fixes #44005
Touches #44101

Release note: None
  • Loading branch information
andreimatei committed Jan 23, 2020
1 parent 630bc57 commit e12735f
Show file tree
Hide file tree
Showing 2 changed files with 14 additions and 2 deletions.
1 change: 0 additions & 1 deletion pkg/cmd/roachtest/acceptance.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,6 @@ func registerAcceptance(r *testRegistry) {
{name: "status-server", fn: runStatusServer},
{
name: "version-upgrade",
skip: "#43957",
fn: runVersionUpgrade,
// This test doesn't like running on old versions because it upgrades to
// the latest released version and then it tries to "head", where head is
Expand Down
15 changes: 14 additions & 1 deletion pkg/server/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -1599,10 +1599,23 @@ func (s *Server) Start(ctx context.Context) error {
if migrationManagerTestingKnobs := s.cfg.TestingKnobs.SQLMigrationManager; migrationManagerTestingKnobs != nil {
mmKnobs = *migrationManagerTestingKnobs.(*sqlmigrations.MigrationManagerTestingKnobs)
}
migrationsExecutor := sql.MakeInternalExecutor(
ctx, s.pgServer.SQLServer, s.internalMemMetrics, s.ClusterSettings())
migrationsExecutor.SetSessionData(
&sessiondata.SessionData{
// Migrations need an executor with query distribution turned off. This is
// because the node crashes if migrations fail to execute, and query
// distribution introduces more moving parts. Local execution is more
// robust; for example, the DistSender has retries if it can't connect to
// another node, but DistSQL doesn't. Also see #44101 for why DistSQL is
// particularly fragile immediately after a node is started (i.e. the
// present situation).
DistSQLMode: sessiondata.DistSQLOff,
})
migMgr := sqlmigrations.NewManager(
s.stopper,
s.db,
s.internalExecutor,
&migrationsExecutor,
s.clock,
mmKnobs,
s.NodeID().String(),
Expand Down

0 comments on commit e12735f

Please sign in to comment.