Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest/costfuzz: join reordering issue #88659

Closed
cockroach-teamcity opened this issue Sep 24, 2022 · 20 comments · Fixed by #88779
Closed

roachtest/costfuzz: join reordering issue #88659

cockroach-teamcity opened this issue Sep 24, 2022 · 20 comments · Fixed by #88779
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-queries SQL Queries Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Sep 24, 2022

roachtest.costfuzz failed with artifacts on master @ 89f4ad907a1756551bd6864c3e8516eeff6b0e0a:

		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runOneRoundQueryComparison
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go:233
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runQueryComparison
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go:66
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runCostFuzz
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/costfuzz.go:41
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:928
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1594
		Wraps: (4) expected unperturbed and perturbed results to be equal
		  |   []string{
		  |   	... // 3 identical elements
		  |   	"104,1.2345678901234568e-31,1664008168391878756.0000000000,1.2345"...,
		  |   	"104,1.2345678901234568e-31,1664008168391878756.0000000000,1.2345"...,
		  | + 	"104,1.2345678901234568e-31,1664008168391878756.0000000000,1.2345678901234563e-07,1.2345678901234566e-20,1.2345678901234566e-20,1.7976931348623157e+308,1.2345678901234566e-20,6.596932228060170625E+37",
		  |   	"104,1.2345678901234568e-31,1664008168391878756.0000000000,1.2345"...,
		  |   }
		  | sql: SELECT
		  | 	tab_27925.tableoid AS col_61799,
		  | 	1.2345678901234568e-31:::FLOAT8 AS col_61800,
		  | 	tab_27923.crdb_internal_mvcc_timestamp AS col_61801,
		  | 	1.2345678901234563e-07:::FLOAT8 AS col_61802,
		  | 	tab_27926.col1_1 AS col_61803,
		  | 	tab_27924.col1_1 AS col_61804,
		  | 	tab_27922.col1_0 AS col_61805,
		  | 	tab_27924.col1_1 AS col_61806,
		  | 	6.596932228060170625E+37:::DECIMAL AS col_61807
		  | FROM
		  | 	defaultdb.public.table1@[0] AS tab_27922
		  | 	RIGHT JOIN defaultdb.public.table1@[0] AS tab_27923 ON
		  | 			(tab_27922.col1_1) = (tab_27923.col1_1)
		  | 			AND (tab_27922.crdb_internal_mvcc_timestamp) >= (tab_27923.crdb_internal_mvcc_timestamp)
		  | 			AND (tab_27922.col1_1) = (tab_27923.col1_0)
		  | 	JOIN defaultdb.public.table1@[0] AS tab_27924 ON
		  | 			(tab_27923.col1_0) = (tab_27924.col1_0) AND (tab_27923.col1_1) = (tab_27924.col1_0)
		  | 	JOIN defaultdb.public.table1@[0] AS tab_27925 ON
		  | 			(tab_27923.col1_0) = (tab_27925.col1_0)
		  | 			AND (tab_27924.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
		  | 			AND (tab_27922.col1_1) = (tab_27925.col1_0)
		  | 			AND (tab_27922.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
		  | 			AND (tab_27924.col1_0) = (tab_27925.col1_0)
		  | 			AND (tab_27924.col1_1) = (tab_27925.col1_1)
		  | 	JOIN defaultdb.public.table1@[0] AS tab_27926 ON
		  | 			(tab_27925.col1_0) = (tab_27926.col1_1)
		  | 			AND (tab_27924.col1_1) = (tab_27926.col1_0)
		  | 			AND (tab_27924.col1_0) = (tab_27926.col1_0)
		  | ORDER BY
		  | 	tab_27924.crdb_internal_mvcc_timestamp
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.leafError

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-19846

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Sep 24, 2022
@cockroach-teamcity cockroach-teamcity added this to the 22.2 milestone Sep 24, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 24, 2022
@cockroach-teamcity

This comment was marked as off-topic.

@cockroach-teamcity

This comment was marked as duplicate.

@mgartner
Copy link
Collaborator

Created #88710 to track the last (hidden) failure.

@mgartner
Copy link
Collaborator

reduce is having difficulties with the original report at the top. I've partially reduced it to:

CREATE TABLE table1 (
  col1_0 FLOAT8 NOT NULL,
  col1_1 FLOAT8 NOT NULL,
  PRIMARY KEY (col1_0 ASC),
  INDEX (col1_1 DESC),
  UNIQUE ((col1_1 + col1_0) ASC) STORING (col1_1),
  UNIQUE (col1_0 ASC) WHERE table1.col1_1 > (-1.0):::FLOAT8
);

ALTER TABLE table1 INJECT STATISTICS '[{"avg_size": 4, "columns": ["col1_1"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 1319231781637867192, "histo_buckets": [{"distinct_range": 0, "num_eq": 900000, "num_range": 0, "upper_bound": "-0.9022742731095619"}, {"distinct_range": 0, "num_eq": 0, "num_range": 200000000, "upper_bound": "-0.03298270844476303"}, {"distinct_range": 90000000, "num_eq": 800000000, "num_range": 90000000, "upper_bound": "0.1759378487060853"}, {"distinct_range": 8000000000, "num_eq": 2493550396308236629, "num_range": 8000000000, "upper_bound": "1.0648625381898507"}], "histo_col_type": "FLOAT8", "name": "__auto__", "null_count": 0, "row_count": 5164449749148844825}, {"avg_size": 5, "columns": ["col1_0"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 2103359280294413962, "histo_buckets": [{"distinct_range": 0, "num_eq": 40, "num_range": 0, "upper_bound": "-1.143144685560062"}, {"distinct_range": 1000000, "num_eq": 6858931304526079805, "num_range": 1000000, "upper_bound": "1.4729149128589558"}], "histo_col_type": "FLOAT8", "name": "__auto__", "null_count": 0, "row_count": 5164449749148844825}]':::JSONB;

SET unconstrained_non_covering_index_scan_enabled = true;

INSERT INTO table1 AS tab_841 (col1_0, col1_1)
VALUES (1.2345678901234566e-49:::FLOAT8, 0.5956640437568941:::FLOAT8);

INSERT INTO table1 AS tab_842 (col1_0, col1_1)
VALUES (1.2345678901234566e-20:::FLOAT8, 1.2345678901234557e+48:::FLOAT8);

UPDATE table1 AS tab_845 SET col1_1 = tab_845.col1_0 WHERE true ORDER BY tab_845.col1_0 LIMIT 20:::INT8;

INSERT INTO table1 AS tab_866 (col1_0, col1_1)
VALUES (1.7976931348623157e+308:::FLOAT8, 1.2345678901234566e-20:::FLOAT8);

SELECT 1
FROM
	table1 AS tab_27922
	RIGHT JOIN table1 AS tab_27923 ON
			(tab_27922.col1_1) = (tab_27923.col1_1)
			AND (tab_27922.crdb_internal_mvcc_timestamp) >= (tab_27923.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27923.col1_0)
	JOIN table1 AS tab_27924 ON
			(tab_27923.col1_0) = (tab_27924.col1_0) AND (tab_27923.col1_1) = (tab_27924.col1_0)
	JOIN table1 AS tab_27925 ON
			(tab_27923.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27925.col1_0)
			AND (tab_27922.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27924.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.col1_1) = (tab_27925.col1_1)
	JOIN table1 AS tab_27926 ON
			AND (tab_27924.col1_1) = (tab_27926.col1_0)
			AND (tab_27924.col1_0) = (tab_27926.col1_0);

SET testing_optimizer_random_seed = 2758112374651167630;

SET testing_optimizer_cost_perturbation = 1.0;

SELECT 1
FROM
	table1 AS tab_27922
	RIGHT JOIN table1 AS tab_27923 ON
			(tab_27922.col1_1) = (tab_27923.col1_1)
			AND (tab_27922.crdb_internal_mvcc_timestamp) >= (tab_27923.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27923.col1_0)
	JOIN table1 AS tab_27924 ON
			(tab_27923.col1_0) = (tab_27924.col1_0) AND (tab_27923.col1_1) = (tab_27924.col1_0)
	JOIN table1 AS tab_27925 ON
			(tab_27923.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27925.col1_0)
			AND (tab_27922.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27924.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.col1_1) = (tab_27925.col1_1)
	JOIN table1 AS tab_27926 ON
			AND (tab_27924.col1_1) = (tab_27926.col1_0)
			AND (tab_27924.col1_0) = (tab_27926.col1_0);

@mgartner
Copy link
Collaborator

Reduced a bit more below. I'm trying to figure out if the joins can be reduced or not...

CREATE TABLE table1 (
  col1_0 INT PRIMARY KEY,
  col1_1 INT NOT NULL,
  INDEX (col1_1 DESC),
  UNIQUE ((col1_1 + col1_0) ASC) STORING (col1_1)
);

ALTER TABLE table1 INJECT STATISTICS '[{"columns": ["col1_1"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 999999999, "name": "__auto__", "null_count": 0, "row_count": 999999999999}]':::JSONB;

SET unconstrained_non_covering_index_scan_enabled = true;

INSERT INTO table1 AS tab_841 (col1_0, col1_1) VALUES (1, 1), (3, 3);

INSERT INTO table1 AS tab_866 (col1_0, col1_1) VALUES (5, 3);

SELECT 1
FROM
	table1 AS tab_27922
	RIGHT JOIN table1 AS tab_27923 ON
			(tab_27922.col1_1) = (tab_27923.col1_1)
			AND (tab_27922.crdb_internal_mvcc_timestamp) >= (tab_27923.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27923.col1_0)
	JOIN table1 AS tab_27924 ON
			(tab_27923.col1_0) = (tab_27924.col1_0) AND (tab_27923.col1_1) = (tab_27924.col1_0)
	JOIN table1 AS tab_27925 ON
			(tab_27923.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27925.col1_0)
			AND (tab_27922.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27924.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.col1_1) = (tab_27925.col1_1)
	JOIN table1 AS tab_27926 ON
			(tab_27925.col1_0) = (tab_27926.col1_1)
			AND (tab_27924.col1_1) = (tab_27926.col1_0)
			AND (tab_27924.col1_0) = (tab_27926.col1_0);

SET testing_optimizer_random_seed = 2758112374651167630;

SET testing_optimizer_cost_perturbation = 1.0;

SELECT 1
FROM
	table1 AS tab_27922
	RIGHT JOIN table1 AS tab_27923 ON
			(tab_27922.col1_1) = (tab_27923.col1_1)
			AND (tab_27922.crdb_internal_mvcc_timestamp) >= (tab_27923.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27923.col1_0)
	JOIN table1 AS tab_27924 ON
			(tab_27923.col1_0) = (tab_27924.col1_0) AND (tab_27923.col1_1) = (tab_27924.col1_0)
	JOIN table1 AS tab_27925 ON
			(tab_27923.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27922.col1_1) = (tab_27925.col1_0)
			AND (tab_27922.crdb_internal_mvcc_timestamp) = (tab_27925.crdb_internal_mvcc_timestamp)
			AND (tab_27924.col1_0) = (tab_27925.col1_0)
			AND (tab_27924.col1_1) = (tab_27925.col1_1)
	JOIN table1 AS tab_27926 ON
			(tab_27925.col1_0) = (tab_27926.col1_1)
			AND (tab_27924.col1_1) = (tab_27926.col1_0)
			AND (tab_27924.col1_0) = (tab_27926.col1_0);

@cockroach-teamcity

This comment was marked as duplicate.

@mgartner
Copy link
Collaborator

I'm having trouble reducing it further. Updating the queries to:

CREATE TABLE t (
  a INT PRIMARY KEY,
  b INT NOT NULL,
  c INT,
  INDEX idx (b DESC),
  UNIQUE INDEX uniq ((b + a) ASC) STORING (b),
  FAMILY (a, b)
);

ALTER TABLE t INJECT STATISTICS '[{"columns": ["b"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 999999999, "name": "__auto__", "null_count": 0, "row_count": 999999999999}]':::JSONB;

SET unconstrained_non_covering_index_scan_enabled = true;

INSERT INTO t AS tab_841 (a, b, c) VALUES (1, 1, 0), (3, 3, 0);

INSERT INTO t AS tab_866 (a, b, c) VALUES (5, 3, 100000);

SELECT 1, t0.a, t0.crdb_internal_mvcc_timestamp, t2.a, t2.crdb_internal_mvcc_timestamp, t3.a, t3.crdb_internal_mvcc_timestamp
FROM
	t AS t0
	RIGHT JOIN t AS t1 ON
			(t0.c) >= (t1.c)
	JOIN t AS t2 ON
			(t1.b) = (t2.a)
	JOIN t AS t3 ON
			(t1.a) = (t3.a)
			AND (t2.crdb_internal_mvcc_timestamp) = (t3.crdb_internal_mvcc_timestamp)
			AND (t0.b) = (t3.a)
			AND (t0.crdb_internal_mvcc_timestamp) = (t3.crdb_internal_mvcc_timestamp)
			AND (t2.a) = (t3.a)
			AND (t2.b) = (t3.b)
	JOIN t AS t4 ON
			(t3.a) = (t4.b)
			AND (t2.b) = (t4.a)
			AND (t2.a) = (t4.a);

SET testing_optimizer_random_seed = 2758112374651167630;

SET testing_optimizer_cost_perturbation = 1.0;

SELECT 1, t0.a, t0.crdb_internal_mvcc_timestamp, t2.a, t2.crdb_internal_mvcc_timestamp, t3.a, t3.crdb_internal_mvcc_timestamp
FROM
	t AS t0
	RIGHT JOIN t AS t1 ON
			(t0.c) >= (t1.c)
	JOIN t AS t2 ON
			(t1.b) = (t2.a)
	JOIN t AS t3 ON
			(t1.a) = (t3.a)
			AND (t2.crdb_internal_mvcc_timestamp) = (t3.crdb_internal_mvcc_timestamp)
			AND (t0.b) = (t3.a)
			AND (t0.crdb_internal_mvcc_timestamp) = (t3.crdb_internal_mvcc_timestamp)
			AND (t2.a) = (t3.a)
			AND (t2.b) = (t3.b)
	JOIN t AS t4 ON
			(t3.a) = (t4.b)
			AND (t2.b) = (t4.a)
			AND (t2.a) = (t4.a);

yields:

  ?column? | a |  crdb_internal_mvcc_timestamp  | a |  crdb_internal_mvcc_timestamp  | a |  crdb_internal_mvcc_timestamp                                                                                                                              [0/34171]
-----------+---+--------------------------------+---+--------------------------------+---+---------------------------------
         1 | 3 | 1664219554005001000.0000000000 | 3 | 1664219554005001000.0000000000 | 3 | 1664219554005001000.0000000000
         1 | 1 | 1664219554005001000.0000000000 | 1 | 1664219554005001000.0000000000 | 1 | 1664219554005001000.0000000000
(2 rows)


  ?column? | a |  crdb_internal_mvcc_timestamp  | a |  crdb_internal_mvcc_timestamp  | a |  crdb_internal_mvcc_timestamp
-----------+---+--------------------------------+---+--------------------------------+---+---------------------------------
         1 | 1 | 1664219554005001000.0000000000 | 1 | 1664219554005001000.0000000000 | 1 | 1664219554005001000.0000000000
         1 | 3 | 1664219554005001000.0000000000 | 3 | 1664219554005001000.0000000000 | 3 | 1664219554005001000.0000000000
         1 | 5 | 1664219554011737000.0000000000 | 3 | 1664219554005001000.0000000000 | 3 | 1664219554005001000.0000000000

It looks like the latter result is incorrect. All three of the crdb_internal_mvcc_timestamp columns should be equal, based on the join filters. But in the last row of the second query result, t0.crdb_internal_mvcc_timestamp is not equal to the other two.

The query plan shows that the (t0.crdb_internal_mvcc_timestamp) = (t3.crdb_internal_mvcc_timestamp) filter is omitted. Notice only the single filter referecing crdb_internal_mvcc_timestamp columns.

  project
   ├── columns: "?column?":31 a:1 crdb_internal_mvcc_timestamp:5 a:13 crdb_internal_mvcc_timestamp:17 a:19 crdb_internal_mvcc_timestamp:23
   ├── immutable
   ├── stats: [rows=3.169967e+08]
   ├── cost: 1.68026691e+11
   ├── key: (1)
   ├── fd: ()-->(31), (1)-->(5,13,17,19,23), (13)-->(17), (13)==(19), (19)-->(23), (5)==(17,23), (17)==(5,23), (23)==(5,17), (19)==(13)
   ├── distribution: us-east1
   ├── prune: (1,5,13,17,19,23,31)
   ├── inner-join (lookup t)
   │    ├── columns: a:1 b:2 c:3 crdb_internal_mvcc_timestamp:5 a:7 b:8 c:9 a:13 b:14 crdb_internal_mvcc_timestamp:17 a:19 b:20 crdb_internal_mvcc_timestamp:23 a:25 b:26
   │    ├── key columns: [2] = [19]
   │    ├── lookup columns are key
   │    ├── immutable
   │    ├── stats: [rows=3.169967e+08, distinct(13)=100, null(13)=0, distinct(19)=100, null(19)=0, distinct(25)=100, null(25)=0, distinct(26)=100, null(26)=0]
   │    ├── cost: 1.68020888e+11
   │    ├── key: (1)
   │    ├── fd: (7)-->(9), (7)==(2,8,13,14,19,20,25,26), (8)==(2,7,13,14,19,20,25,26), (1)-->(2,3,5), (2)==(7,8,13,14,19,20,25,26), (13)-->(17), (13)==(2,7,8,14,19,20,25,26), (14)==(2,7,8,13,19,20,25,26), (19)-->(23), (5)==(17,23), (17)==(5,23), (23)==(5,17), (19)==(2,7,8,13,14,20,25,26), (20)==(2,7,8,13,14,19,25,26), (26)==(2,7,8,13,14,19,20,25), (25)==(2,7,8,13,14,19,20,26)
   │    ├── distribution: us-east1
   │    ├── inner-join (lookup t@idx)
   │    │    ├── columns: a:1 b:2 c:3 crdb_internal_mvcc_timestamp:5 a:7 b:8 c:9 a:13 b:14 crdb_internal_mvcc_timestamp:17 a:25 b:26
   │    │    ├── key columns: [2 13] = [26 25]
   │    │    ├── lookup columns are key
   │    │    ├── immutable
   │    │    ├── stats: [rows=320198.7, distinct(1)=273532, null(1)=0, distinct(2)=100, null(2)=0, distinct(3)=320199, null(3)=0, distinct(5)=99, null(5)=0, distinct(7)=100, null(7)=0, distinct(8)=100, null(8)=0, distinct(9)=100, null(9)=0, distinct(13)=100, null(13)=0, distinct(14)=100, null(14)=0, distinct(17)=99, null(17)=0, distinct(25)=100, null(25)=0, distinct(26)=100, null(26)=0]
   │    │    ├── cost: 1.67920272e+11
   │    │    ├── key: (1)
   │    │    ├── fd: (7)-->(9), (7)==(2,8,13,14,25,26), (8)==(2,7,13,14,25,26), (1)-->(2,3,5), (13)-->(17), (13)==(2,7,8,14,25,26), (14)==(2,7,8,13,25,26), (25)==(2,7,8,13,14,26), (26)==(2,7,8,13,14,25), (2)==(7,8,13,14,25,26), (5)==(17), (17)==(5)
   │    │    ├── distribution: us-east1
   │    │    ├── interesting orderings: (+(7|8)) (-(7|8)) (+1) (-2,+1) (+(13|14)) (-(13|14)) (+25) (-26,+25)
   │    │    ├── inner-join (lookup t)
   │    │    │    ├── columns: a:1 b:2 c:3 crdb_internal_mvcc_timestamp:5 a:7 b:8 c:9 a:13 b:14 crdb_internal_mvcc_timestamp:17
   │    │    │    ├── key columns: [1] = [1]
   │    │    │    ├── lookup columns are key
   │    │    │    ├── stats: [rows=32670, distinct(1)=20711.7, null(1)=0, distinct(2)=100, null(2)=0, distinct(3)=20711.7, null(3)=0, distinct(5)=20711.7, null(5)=326.7, distinct(7)=100, null(7)=0, distinct(8)=100, null(8)=0, distinct(9)=100, null(9)=0, distinct(13)=100, null(13)=0, distinct(14)=100, null(14)=0, distinct(17)=100, null(17)=326.7]
   │    │    │    ├── cost: 1.67919834e+11
   │    │    │    ├── key: (1)
   │    │    │    ├── fd: (7)-->(9), (7)==(2,8,13,14), (8)==(2,7,13,14), (1)-->(2,3,5), (2)==(7,8,13,14), (13)-->(17), (13)==(2,7,8,14), (14)==(2,7,8,13)
   │    │    │    ├── distribution: us-east1
   │    │    │    ├── prune: (1,5,17)
   │    │    │    ├── interesting orderings: (+(7|8)) (-(7|8)) (+1) (-2,+1) (+(13|14)) (-(13|14))
   │    │    │    ├── inner-join (lookup t@idx)
   │    │    │    │    ├── columns: a:1 b:2 a:7 b:8 c:9 a:13 b:14 crdb_internal_mvcc_timestamp:17
   │    │    │    │    ├── key columns: [7] = [2]
   │    │    │    │    ├── stats: [rows=100000, distinct(1)=100000, null(1)=0, distinct(2)=100, null(2)=0, distinct(7)=100, null(7)=0, distinct(8)=63.3968, null(8)=0, distinct(13)=100, null(13)=0, distinct(14)=63.3968, null(14)=0]
   │    │    │    │    ├── cost: 1.6791381e+11
   │    │    │    │    ├── key: (1)
   │    │    │    │    ├── fd: (7)-->(9), (7)==(2,8,13,14), (8)==(2,7,13,14), (13)-->(17), (13)==(2,7,8,14), (14)==(2,7,8,13), (1)-->(2), (2)==(7,8,13,14)
   │    │    │    │    ├── distribution: us-east1
   │    │    │    │    ├── interesting orderings: (+(2|7|8|13|14)) (-(2|7|8|13|14))
   │    │    │    │    ├── inner-join (lookup t)
   │    │    │    │    │    ├── columns: a:7 b:8 c:9 a:13 b:14 crdb_internal_mvcc_timestamp:17
   │    │    │    │    │    ├── key columns: [13] = [7]
   │    │    │    │    │    ├── lookup columns are key
   │    │    │    │    │    ├── stats: [rows=100, distinct(7)=100, null(7)=0, distinct(8)=63.3968, null(8)=0, distinct(13)=100, null(13)=0, distinct(14)=63.3968, null(14)=0, distinct(17)=63.3968, null(17)=1]
   │    │    │    │    │    ├── cost: 1.67913484e+11
   │    │    │    │    │    ├── key: (13)
   │    │    │    │    │    ├── fd: (7)-->(9), (7)==(8,13,14), (8)==(7,13,14), (13)-->(17), (13)==(7,8,14), (14)==(7,8,13)
   │    │    │    │    │    ├── distribution: us-east1
   │    │    │    │    │    ├── interesting orderings: (+(7|8)) (-(7|8)) (+(13|14)) (-(13|14))
   │    │    │    │    │    ├── select
   │    │    │    │    │    │    ├── columns: a:13 b:14 crdb_internal_mvcc_timestamp:17
   │    │    │    │    │    │    ├── stats: [rows=100, distinct(13)=100, null(13)=0, distinct(14)=100, null(14)=0, distinct(17)=100, null(17)=1]
   │    │    │    │    │    │    ├── cost: 1.67913459e+11
   │    │    │    │    │    │    ├── key: (13)
   │    │    │    │    │    │    ├── fd: (13)-->(17), (13)==(14), (14)==(13)
   │    │    │    │    │    │    ├── distribution: us-east1
   │    │    │    │    │    │    ├── prune: (17)
   │    │    │    │    │    │    ├── interesting orderings: (+(13|14)) (-(13|14))
   │    │    │    │    │    │    ├── scan t
   │    │    │    │    │    │    │    ├── columns: a:13 b:14 crdb_internal_mvcc_timestamp:17
   │    │    │    │    │    │    │    ├── computed column expressions
   │    │    │    │    │    │    │    │    └── crdb_internal_idx_expr:16
   │    │    │    │    │    │    │    │         └── b:14 + a:13
   │    │    │    │    │    │    │    ├── stats: [rows=1e+12, distinct(13)=1e+12, null(13)=0, distinct(14)=1e+09, null(14)=0, distinct(17)=1e+11, null(17)=1e+10]
   │    │    │    │    │    │    │    ├── cost: 1.6014708e+11
   │    │    │    │    │    │    │    ├── key: (13)
   │    │    │    │    │    │    │    ├── fd: (13)-->(14,17)
   │    │    │    │    │    │    │    ├── distribution: us-east1
   │    │    │    │    │    │    │    ├── prune: (13,14,17)
   │    │    │    │    │    │    │    ├── interesting orderings: (+13) (-14,+13)
   │    │    │    │    │    │    │    └── unfiltered-cols: (13-18)
   │    │    │    │    │    │    └── filters
   │    │    │    │    │    │         └── a:13 = b:14 [outer=(13,14), constraints=(/13: (/NULL - ]; /14: (/NULL - ]), fd=(13)==(14), (14)==(13)]
   │    │    │    │    │    └── filters
   │    │    │    │    │         └── a:7 = b:8 [outer=(7,8), constraints=(/7: (/NULL - ]; /8: (/NULL - ]), fd=(7)==(8), (8)==(7)]
   │    │    │    │    └── filters (true)
   │    │    │    └── filters
   │    │    │         └── c:3 >= c:9 [outer=(3,9), constraints=(/3: (/NULL - ]; /9: (/NULL - ])]
   │    │    └── filters (true)
   │    └── filters
   │         ├── crdb_internal_mvcc_timestamp:17 = crdb_internal_mvcc_timestamp:23 [outer=(17,23), immutable, constraints=(/17: (/NULL - ]; /23: (/NULL - ]), fd=(17)==(23), (23)==(17)]
   │         └── b:14 = b:20 [outer=(14,20), constraints=(/14: (/NULL - ]; /20: (/NULL - ]), fd=(14)==(20), (20)==(14)]
   └── projections
        └── 1 [as="?column?":31]

@mgartner
Copy link
Collaborator

mgartner commented Sep 26, 2022

Slight reduction by using a DECIMAL column instead of crdb_internal_mvcc_timestamp.

CREATE TABLE t (
  a INT PRIMARY KEY,
  b INT NOT NULL,
  c DECIMAL,
  INDEX idx (b DESC),
  UNIQUE INDEX uniq ((b + a) ASC) STORING (b),
  FAMILY (a, b)
);

ALTER TABLE t INJECT STATISTICS '[{"columns": ["b"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 999999999, "name": "__auto__", "null_count": 0, "row_count": 999999999999}]':::JSONB;

SET unconstrained_non_covering_index_scan_enabled = true;

INSERT INTO t AS tab_841 (a, b, c) VALUES (1, 1, 1.0), (3, 3, 1.0);

INSERT INTO t AS tab_866 (a, b, c) VALUES (5, 3, 3.0);

SELECT 1, t0.a, t0.c, t2.a, t2.c, t3.a, t3.c
FROM
	t AS t0
	RIGHT JOIN t AS t1 ON
			(t0.c) >= (t1.c)
	JOIN t AS t2 ON
			(t1.b) = (t2.a)
	JOIN t AS t3 ON
			(t1.a) = (t3.a)
			AND (t2.c) = (t3.c)
			AND (t0.b) = (t3.a)
			AND (t0.c) = (t3.c)
			AND (t2.a) = (t3.a)
			AND (t2.b) = (t3.b)
	JOIN t AS t4 ON
			(t3.a) = (t4.b)
			AND (t2.b) = (t4.a)
			AND (t2.a) = (t4.a);

SET testing_optimizer_random_seed = 2758112374651167630;

SET testing_optimizer_cost_perturbation = 1.0;

SELECT 1, t0.a, t0.c, t2.a, t2.c, t3.a, t3.c
FROM
	t AS t0
	RIGHT JOIN t AS t1 ON
			(t0.c) >= (t1.c)
	JOIN t AS t2 ON
			(t1.b) = (t2.a)
	JOIN t AS t3 ON
			(t1.a) = (t3.a)
			AND (t2.c) = (t3.c)
			AND (t0.b) = (t3.a)
			AND (t0.c) = (t3.c)
			AND (t2.a) = (t3.a)
			AND (t2.b) = (t3.b)
	JOIN t AS t4 ON
			(t3.a) = (t4.b)
			AND (t2.b) = (t4.a)
			AND (t2.a) = (t4.a);

@mgartner
Copy link
Collaborator

This might be related to join reordering. I noticed that when reordering the joins, a filter holding c:3 equal to c:15, or euivalently holidng c:3 = c:21, is sometimes omitted. For example:

New expression 9 of 29:
  project
   ├── columns: "?column?":31!null a:1!null c:3!null a:13!null c:15!null a:19!null c:21!null
   ├── immutable
   ├── key: (1)
   ├── fd: ()-->(31), (1)-->(3,13,15,19,21), (13)-->(15), (13)==(19), (19)-->(21), (3)==(15,21), (15)==(3,21), (21)==(3,15), (19)==(13)
   ├── inner-join (hash)
   │    ├── columns: a:1!null b:2!null c:3!null a:7!null b:8!null c:9!null a:13!null b:14!null c:15!null a:19!null b:20!null c:21!null a:25!null b:26!null
   │    ├── multiplicity: left-rows(zero-or-more), right-rows(zero-or-one)
   │    ├── immutable
   │    ├── key: (1)
   │    ├── fd: (7)-->(9), (7)==(2,8,13,14,19,20,25,26), (8)==(2,7,13,14,19,20,25,26), (1)-->(2,3), (2)==(7,8,13,14,19,20,25,26), (13)-->(15), (13)==(2,7,8,14,19,20,25,26), (14)==(2,7,8,13,19,20,25,26), (19)-->(21), (3)==(15,21), (15)==(3,21), (21)==(3,15), (19)==(2,7,8,13,14,20,25,26), (20)==(2,7,8,13,14,19,25,26), (26)==(2,7,8,13,14,19,20,25), (25)==(2,7,8,13,14,19,20,26)
   │    ├── inner-join (lookup t)
   │    │    ├── columns: a:7!null b:8!null c:9 a:13!null b:14!null c:15
   │    │    ├── key columns: [13] = [7]
   │    │    ├── lookup columns are key
   │    │    ├── key: (13)
   │    │    ├── fd: (7)-->(9), (7)==(8,13,14), (8)==(7,13,14), (13)-->(15), (13)==(7,8,14), (14)==(7,8,13)
   │    │    ├── select
   │    │    │    ├── columns: a:13!null b:14!null c:15
   │    │    │    ├── key: (13)
   │    │    │    ├── fd: (13)-->(15), (13)==(14), (14)==(13)
   │    │    │    ├── scan t
   │    │    │    │    ├── columns: a:13!null b:14!null c:15
   │    │    │    │    ├── computed column expressions
   │    │    │    │    │    └── crdb_internal_idx_expr:18
   │    │    │    │    │         └── b:14 + a:13
   │    │    │    │    ├── key: (13)
   │    │    │    │    └── fd: (13)-->(14,15)
   │    │    │    └── filters
   │    │    │         └── a:13 = b:14 [outer=(13,14), constraints=(/13: (/NULL - ]; /14: (/NULL - ]), fd=(13)==(14), (14)==(13)]
   │    │    └── filters
   │    │         └── a:7 = b:8 [outer=(7,8), constraints=(/7: (/NULL - ]; /8: (/NULL - ]), fd=(7)==(8), (8)==(7)]
   │    ├── inner-join (hash)
   │    │    ├── columns: a:1!null b:2!null c:3!null a:19!null b:20!null c:21!null a:25!null b:26!null
   │    │    ├── multiplicity: left-rows(zero-or-more), right-rows(zero-or-one)
   │    │    ├── immutable
   │    │    ├── key: (1)
   │    │    ├── fd: (1)-->(2,3), (19)-->(21), (19)==(2,20,25,26), (20)==(2,19,25,26), (25)==(2,19,20,26), (26)==(2,19,20,25), (2)==(19,20,25,26), (3)==(21), (21)==(3)
   │    │    ├── scan t@idx
   │    │    │    ├── columns: a:25!null b:26!null
   │    │    │    ├── key: (25)
   │    │    │    └── fd: (25)-->(26)
   │    │    ├── inner-join (merge)
   │    │    │    ├── columns: a:1!null b:2!null c:3!null a:19!null b:20!null c:21!null
   │    │    │    ├── left ordering: -2,+3
   │    │    │    ├── right ordering: -19,+21
   │    │    │    ├── immutable
   │    │    │    ├── key: (1)
   │    │    │    ├── fd: (1)-->(2,3), (19)-->(21), (2)==(19,20), (19)==(2,20), (20)==(2,19), (3)==(21), (21)==(3)
   │    │    │    ├── sort
   │    │    │    │    ├── columns: a:1!null b:2!null c:3
   │    │    │    │    ├── key: (1)
   │    │    │    │    ├── fd: (1)-->(2,3)
   │    │    │    │    ├── ordering: -2,+3
   │    │    │    │    └── scan t
   │    │    │    │         ├── columns: a:1!null b:2!null c:3
   │    │    │    │         ├── computed column expressions
   │    │    │    │         │    └── crdb_internal_idx_expr:6
   │    │    │    │         │         └── b:2 + a:1
   │    │    │    │         ├── key: (1)
   │    │    │    │         └── fd: (1)-->(2,3)
   │    │    │    ├── scan t,rev
   │    │    │    │    ├── columns: a:19!null b:20!null c:21
   │    │    │    │    ├── computed column expressions
   │    │    │    │    │    └── crdb_internal_idx_expr:24
   │    │    │    │    │         └── b:20 + a:19
   │    │    │    │    ├── key: (19)
   │    │    │    │    ├── fd: (19)-->(20,21)
   │    │    │    │    └── ordering: -19
   │    │    │    └── filters
   │    │    │         └── b:2 = b:20 [outer=(2,20), constraints=(/2: (/NULL - ]; /20: (/NULL - ]), fd=(2)==(20), (20)==(2)]
   │    │    └── filters
   │    │         ├── a:19 = b:26 [outer=(19,26), constraints=(/19: (/NULL - ]; /26: (/NULL - ]), fd=(19)==(26), (26)==(19)]
   │    │         └── b:2 = a:25 [outer=(2,25), constraints=(/2: (/NULL - ]; /25: (/NULL - ]), fd=(2)==(25), (25)==(2)]
   │    └── filters
   │         ├── c:3 >= c:9 [outer=(3,9), immutable, constraints=(/3: (/NULL - ]; /9: (/NULL - ])]
   │         ├── a:7 = b:2 [outer=(2,7), constraints=(/2: (/NULL - ]; /7: (/NULL - ]), fd=(2)==(7), (7)==(2)]
   │         └── c:15 = c:21 [outer=(15,21), immutable, constraints=(/15: (/NULL - ]; /21: (/NULL - ]), fd=(15)==(21), (21)==(15)]
   └── projections
        └── 1 [as="?column?":31]

This smells a lot like #76522.

@DrewKimball
Copy link
Collaborator

DrewKimball commented Sep 26, 2022

I noticed that when reordering the joins, a filter holding c:3 equal to c:15, or euivalently holidng c:3 = c:21, is sometimes omitted. For example:

I don't see 3=15 or 3=21 being omitted there, maybe you're missing the 3=21 in the merge join?

@DrewKimball
Copy link
Collaborator

DrewKimball commented Sep 26, 2022

I'm getting null rejection requested on non-null column when I try to run the newest reproduction under optsteps, I wonder if it's related? It's kind of surprising, since no rules should have been disabled...

@DrewKimball
Copy link
Collaborator

I'm getting null rejection requested on non-null column when I try to run the newest reproduction under optsteps, I wonder if it's related? It's kind of surprising, since no rules should have been disabled...

Ok, this problem seems unrelated. The optsteps tool effectively disables all rules once the required number of steps has been reached for an iteration, which the DeriveRejectNullCols doesn't expect.

@DrewKimball
Copy link
Collaborator

Something is amiss:

================================================================================
GenerateMergeJoins
  Cost: 324852422981.15
================================================================================
   project
    ├── columns: "?column?":31!null a:1!null c:3!null a:13!null c:15!null a:19!null c:21!null
    ├── immutable
    ├── key: (1)
    ├── fd: ()-->(31), (1)-->(3,13,15,19,21), (13)-->(15), (13)==(19), (19)-->(21), (3)==(15,21), (15)==(3,21), (21)==(3,15), (19)==(13)
  - ├── inner-join (lookup t)
  + ├── inner-join (lookup t@idx)
    │    ├── columns: a:1!null b:2!null c:3!null a:7!null b:8!null c:9!null a:13!null b:14!null c:15!null a:19!null b:20!null c:21!null a:25!null b:26!null
  - │    ├── key columns: [13] = [25]
  + │    ├── key columns: [19 13] = [26 25]
    │    ├── lookup columns are key
    │    ├── immutable
    │    ├── key: (1)
    │    ├── fd: (7)-->(9), (7)==(2,8,13,14,19,20,25,26), (8)==(2,7,13,14,19,20,25,26), (1)-->(2,3), (2)==(7,8,13,14,19,20,25,26), (13)-->(15), (13)==(2,7,8,14,19,20,25,26), (14)==(2,7,8,13,19,20,25,26), (19)-->(21), (3)==(15,21), (15)==(3,21), (21)==(3,15), (19)==(2,7,8,13,14,20,25,26), (20)==(2,7,8,13,14,19,25,26), (26)==(2,7,8,13,14,19,20,25), (25)==(2,7,8,13,14,19,20,26)
    │    ├── inner-join (lookup t)
    │    │    ├── columns: a:1!null b:2!null c:3!null a:7!null b:8!null c:9!null a:13!null b:14!null c:15!null a:19!null b:20!null c:21!null
    │    │    ├── key columns: [2] = [19]
    │    │    ├── lookup columns are key
    │    │    ├── immutable
    │    │    ├── key: (1)
    │    │    ├── fd: (7)-->(9), (7)==(2,8,13,14,19,20), (8)==(2,7,13,14,19,20), (1)-->(2,3), (2)==(7,8,13,14,19,20), (13)-->(15), (13)==(2,7,8,14,19,20), (14)==(2,7,8,13,19,20), (19)-->(21), (3)==(15,21), (15)==(3,21), (21)==(3,15), (19)==(2,7,8,13,14,20), (20)==(2,7,8,13,14,19)
    │    │    ├── inner-join (lookup t)
    │    │    │    ├── columns: a:1!null b:2!null c:3!null a:7!null b:8!null c:9!null a:13!null b:14!null c:15
    │    │    │    ├── key columns: [2] = [7]
    │    │    │    ├── lookup columns are key
    │    │    │    ├── immutable
    │    │    │    ├── key: (1)
    │    │    │    ├── fd: (7)-->(9), (7)==(2,8,13,14), (8)==(2,7,13,14), (1)-->(2,3), (2)==(7,8,13,14), (13)-->(15), (13)==(2,7,8,14), (14)==(2,7,8,13)
    │    │    │    ├── inner-join (lookup t)
    │    │    │    │    ├── columns: a:1!null b:2!null c:3 a:13!null b:14!null c:15
    │    │    │    │    ├── key columns: [1] = [1]
    │    │    │    │    ├── lookup columns are key
    │    │    │    │    ├── key: (1)
    │    │    │    │    ├── fd: (1)-->(2,3), (13)-->(15), (13)==(2,14), (14)==(2,13), (2)==(13,14)
    │    │    │    │    ├── inner-join (lookup t@idx)
    │    │    │    │    │    ├── columns: a:1!null b:2!null a:13!null b:14!null c:15
    │    │    │    │    │    ├── key columns: [13] = [2]
    │    │    │    │    │    ├── key: (1)
    │    │    │    │    │    ├── fd: (13)-->(15), (13)==(2,14), (14)==(2,13), (1)-->(2), (2)==(13,14)
    │    │    │    │    │    ├── select
    │    │    │    │    │    │    ├── columns: a:13!null b:14!null c:15
    │    │    │    │    │    │    ├── key: (13)
    │    │    │    │    │    │    ├── fd: (13)-->(15), (13)==(14), (14)==(13)
    │    │    │    │    │    │    ├── scan t
    │    │    │    │    │    │    │    ├── columns: a:13!null b:14!null c:15
    │    │    │    │    │    │    │    ├── computed column expressions
    │    │    │    │    │    │    │    │    └── crdb_internal_idx_expr:18
    │    │    │    │    │    │    │    │         └── b:14 + a:13
    │    │    │    │    │    │    │    ├── key: (13)
    │    │    │    │    │    │    │    └── fd: (13)-->(14,15)
    │    │    │    │    │    │    └── filters
    │    │    │    │    │    │         └── a:13 = b:14 [outer=(13,14), constraints=(/13: (/NULL - ]; /14: (/NULL - ]), fd=(13)==(14), (14)==(13)]
    │    │    │    │    │    └── filters (true)
    │    │    │    │    └── filters (true)
    │    │    │    └── filters
    │    │    │         ├── c:3 >= c:9 [outer=(3,9), immutable, constraints=(/3: (/NULL - ]; /9: (/NULL - ])]
    │    │    │         └── a:7 = b:8 [outer=(7,8), constraints=(/7: (/NULL - ]; /8: (/NULL - ]), fd=(7)==(8), (8)==(7)]
    │    │    └── filters
    │    │         ├── c:15 = c:21 [outer=(15,21), immutable, constraints=(/15: (/NULL - ]; /21: (/NULL - ]), fd=(15)==(21), (21)==(15)]
    │    │         ├── c:3 = c:21 [outer=(3,21), immutable, constraints=(/3: (/NULL - ]; /21: (/NULL - ]), fd=(3)==(21), (21)==(3)]
    │    │         └── b:14 = b:20 [outer=(14,20), constraints=(/14: (/NULL - ]; /20: (/NULL - ]), fd=(14)==(20), (20)==(14)]
  - │    └── filters
  - │         └── a:19 = b:26 [outer=(19,26), constraints=(/19: (/NULL - ]; /26: (/NULL - ]), fd=(19)==(26), (26)==(19)]
  + │    └── filters (true)
    └── projections
         └── 1 [as="?column?":31]

@rytaft
Copy link
Collaborator

rytaft commented Sep 26, 2022

Something is amiss:

The change looks ok to me, but it's not a merge join -- is that what you're concerned by?

@DrewKimball
Copy link
Collaborator

The change looks ok to me, but it's not a merge join -- is that what you're concerned by?

Yes, is that expected?

@rytaft
Copy link
Collaborator

rytaft commented Sep 26, 2022

Does seem odd...

@DrewKimball
Copy link
Collaborator

Ok, I think I've got it: the joinorderbuilder expects filters to have been pushed down as far as possible in the original join tree, but we aren't doing that for 3=15 in this case. We end up adding a join tree with the 3=15 filter to a group that doesn't have it (but joins the same base relations).

@mgartner
Copy link
Collaborator

So the filters are missing, correct? Just not in the last example I posted, because I missed them in the merge join. Correct?

@DrewKimball
Copy link
Collaborator

Yes. So this is a real bug, not just a testing problem.

DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Sep 27, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up, and
expects filters to be pushed down as far as possible at each step. It also
reuses the original matched joins when possible, to avoid duplicate work.
This could previously cause filters to be dropped in the case when they
weren't pushed all the way down.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters weren't pushed all the way down in the original join tree. When this
is true, none of the originally matched joins can be reused when reordered
joins are built except for the root join. This solution may perform some
duplicate work when filters aren't pushed down, but it shouldn't matter
because this case is rare (and should be avoided whenever possible).

Fixes cockroachdb#88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan in rare cases.
@mgartner
Copy link
Collaborator

Since this is present on older versions, I'm removing the release blocker label.

@mgartner mgartner removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. blocks-22.2.0-beta.2 labels Sep 27, 2022
@yuzefovich yuzefovich changed the title roachtest: costfuzz failed roachtest/costfuzz: join reordering issue Sep 27, 2022
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Sep 30, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so
wouldn't add it later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially correct.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes cockroachdb#88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.
craig bot pushed a commit that referenced this issue Oct 2, 2022
88779: opt: don't add reordered join with extra filters to original memo group r=DrewKimball a=DrewKimball

The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so it
wouldn't be added later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially the problem fixed
by this commit.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes #88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.

Co-authored-by: DrewKimball <drewk@cockroachlabs.com>
@craig craig bot closed this as completed in 73e8b28 Oct 2, 2022
blathers-crl bot pushed a commit that referenced this issue Oct 2, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so it
wouldn't be added later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially the problem fixed
by this commit.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes #88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.
@rytaft rytaft added S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. labels Oct 3, 2022
blathers-crl bot pushed a commit that referenced this issue Oct 4, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so it
wouldn't be added later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially the problem fixed
by this commit.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes #88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 11, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so it
wouldn't be added later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially the problem fixed
by this commit.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes cockroachdb#88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 17, 2022
The `JoinOrderBuilder` builds reordered join plans from the bottom up.
It expects filters to be pushed down as far as possible at each step, and
that transitive closure has been calculated over Inner Join equality filters
(e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins
when possible to avoid duplicate work by adding to the original memo groups.

This could previously cause filters to be dropped in the case when the
original join tree did not compute transitive closure and push filters down
as far as possible. More specifically, the `JoinOrderBuilder` could add new
reordered joins with new filters synthesized and pushed down as far as possible
to an original memo group that didn't have one of those filters. Subsequent
joins would then expect the filter to be part of the memo group, and so it
wouldn't be added later on in the plan. In the rare case when the expression
without the filter was chosen, this could manifest as a dropped filter in the
final plan. This was rare because dropping a filter usually does not produce
a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to sythesize and push down a `x = a` filter, and so the
`JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, an expect the `x = a` filter to
be present. If costing happened to choose the first expression in the group,
we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Where the `a = u` filter isn't included in the top-level join because it
would be redundant to add it when `x = u` and `x = a` are already present.
This is a bit of a simplification, but is essentially the problem fixed
by this commit.

This commit adds a check to the `JoinOrderBuilder` to identify cases where
filters (including ones sythesized from the transitive closure) weren't
pushed all the way down in the original join tree. When this is true, none
of the originally matched joins can be reused when reordered joins are
built except for the root join. This solution may perform some duplicate
work when filters aren't pushed down, but it shouldn't matter because this
case is rare (and should be avoided whenever possible).

Fixes cockroachdb#88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause
filters to be dropped from a query plan with many joins in rare cases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants