-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
opt: session setting to enforce queries access only home region rows
Informs #83819 This commit adds a new session setting, `enforce_home_region`, which causes queries to error out if they need to talk to regions other than the gateway region to answer the query. A home region is like the primary region in a multiregion database, and specifies the region(s) from which consistent reads from a table can be served locally. The home region for a set of rows in a multiregion table is determined differently depending on the type of multiregion table involved: | Locality | Home Region | | -------- | ----------- | | REGIONAL BY ROW | Home region determined by crdb_region column value | | REGIONAL BY TABLE | All rows share a single home region | | GLOBAL | Any region can act as the home region | When `enforce_home_region` is true, and a query has no home region (for example, reading from different home regions in a REGIONAL BY ROW table), error code 42899 (`QueryHasNoHomeRegion`) is returned. When `enforce_home_region` is true, and a query's home region differs from the gateway region, error code 42898 (`QueryNotRunningInHomeRegion`) is returned. The error message, in some instances, provides a useful tip on possible steps to take to allow the query to run entirely in the gateway region, e.g., ``` Query is not running in its home region. Try running the query from region 'ap-southeast-2'. Query has no home region. Try removing the join expression. Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. ``` Support for this new session mode is being added in 3 phases. This commit consists of phase 1, which include only simple static checks during query compilation for the following allowed cases: - A scan of a table with `LOCALITY REGIONAL BY TABLE` with primary region matching the gateway region - A scan of a table with `LOCALITY GLOBAL` - A scan of a table with `LOCALITY REGIONAL BY ROW` using only local constraints (e.g. crdb_region = 'ca-central-1') - A scan of a table with `LOCALITY REGIONAL BY ROW` using locality-optimized search. - A lookup join into a table with `LOCALITY REGIONAL BY ROW` using locality-optimized lookup. Only tables in multiregion databases with ZONE survivability may be scanned without error because with REGION survivability, ranges in a down region may be served non-local to the gateway region, so are not guaranteed to have low latency. Note that locality-optimized search and lookup join are not guaranteed to scan no remote rows, but are still allowed. Release note (sql change): A new session setting, enforce_home_region, is added, which when true causes queries which may scan rows via a database connection outside of the query's home region to error out. Also, only tables in multiregion databases with ZONE survivability may be scanned without error when this setting is true because with REGION survivability, ranges in a down region may be served non-local to the gateway region, so are not guaranteed to have low latency.
- Loading branch information
Mark Sirek
committed
Aug 7, 2022
1 parent
7d06d8d
commit 3a914bd
Showing
39 changed files
with
892 additions
and
51 deletions.
There are no files selected for viewing
354 changes: 354 additions & 0 deletions
354
pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,354 @@ | ||
# tenant-cluster-setting-override-opt: allow-multi-region-abstractions-for-secondary-tenants | ||
# LogicTest: multiregion-9node-3region-3azs !metamorphic | ||
|
||
# Set the closed timestamp interval to be short to shorten the amount of time | ||
# we need to wait for the system config to propagate. | ||
statement ok | ||
SET CLUSTER SETTING kv.closed_timestamp.side_transport_interval = '10ms'; | ||
|
||
statement ok | ||
SET CLUSTER SETTING kv.closed_timestamp.target_duration = '10ms'; | ||
|
||
# Start with SURVIVE ZONE FAILURE for positive tests. | ||
# SURVIVE REGION FAILURE cases will always error out. | ||
statement ok | ||
CREATE DATABASE multi_region_test_db PRIMARY REGION "ap-southeast-2" REGIONS "ca-central-1", "us-east-1" SURVIVE ZONE FAILURE; | ||
|
||
statement ok | ||
USE multi_region_test_db | ||
|
||
query T | ||
SELECT gateway_region(); | ||
---- | ||
ap-southeast-2 | ||
|
||
statement ok | ||
CREATE TABLE messages_global ( | ||
account_id INT NOT NULL, | ||
message_id UUID DEFAULT gen_random_uuid(), | ||
message STRING NOT NULL, | ||
crdb_region crdb_internal_region NOT NULL, | ||
PRIMARY KEY (account_id), | ||
INDEX msg_idx(message) | ||
) LOCALITY GLOBAL | ||
|
||
statement ok | ||
CREATE TABLE messages_rbt ( | ||
account_id INT NOT NULL, | ||
message_id UUID DEFAULT gen_random_uuid(), | ||
message STRING NOT NULL, | ||
crdb_region crdb_internal_region NOT NULL, | ||
PRIMARY KEY (account_id), | ||
INDEX msg_idx(message) | ||
) LOCALITY REGIONAL BY TABLE | ||
|
||
statement ok | ||
CREATE TABLE messages_rbr ( | ||
account_id INT NOT NULL, | ||
message_id UUID DEFAULT gen_random_uuid(), | ||
message STRING NOT NULL, | ||
crdb_region crdb_internal_region NOT NULL, | ||
PRIMARY KEY (account_id), | ||
INDEX msg_idx(message) | ||
) | ||
LOCALITY REGIONAL BY ROW | ||
|
||
statement ok | ||
CREATE TABLE customers ( | ||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(), | ||
name STRING NOT NULL | ||
) LOCALITY REGIONAL BY ROW; | ||
|
||
statement ok | ||
ALTER TABLE customers INJECT STATISTICS '[ | ||
{ | ||
"columns": ["id"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 100 | ||
}, | ||
{ | ||
"columns": ["crdb_region"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 3 | ||
}, | ||
{ | ||
"columns": ["crdb_region","id"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 100 | ||
} | ||
]' | ||
|
||
statement ok | ||
CREATE TABLE orders ( | ||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(), | ||
cust_id UUID NOT NULL, | ||
items STRING NOT NULL, | ||
INDEX (cust_id), | ||
FOREIGN KEY (cust_id, crdb_region) REFERENCES customers (id, crdb_region) ON UPDATE CASCADE | ||
) LOCALITY REGIONAL BY ROW; | ||
|
||
statement ok | ||
ALTER TABLE orders INJECT STATISTICS '[ | ||
{ | ||
"columns": ["id"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 100 | ||
}, | ||
{ | ||
"columns": ["cust_id"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 10 | ||
}, | ||
{ | ||
"columns": ["crdb_region"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 3 | ||
}, | ||
{ | ||
"columns": ["crdb_region","id"], | ||
"created_at": "2018-05-01 1:00:00.00000+00:00", | ||
"row_count": 100, | ||
"distinct_count": 100 | ||
} | ||
]' | ||
|
||
statement ok | ||
SET enforce_home_region = true | ||
|
||
statement ok | ||
CREATE TABLE parent ( | ||
p_id INT PRIMARY KEY, | ||
FAMILY (p_id) | ||
) LOCALITY REGIONAL BY ROW; | ||
|
||
statement ok | ||
CREATE TABLE child ( | ||
c_id INT PRIMARY KEY, | ||
c_p_id INT REFERENCES parent (p_id), | ||
INDEX (c_p_id), | ||
FAMILY (c_id, c_p_id) | ||
) LOCALITY REGIONAL BY ROW; | ||
|
||
# Non-locality-optimized lookup join should error out. | ||
statement error pq: Query has no home region. Try removing the join expression. | ||
EXPLAIN(OPT) SELECT * FROM parent p, child c WHERE p_id = c_p_id AND | ||
p.crdb_region = c.crdb_region LIMIT 1 | ||
|
||
# Locality optimized join is allowed with the enforce_home_region setting. | ||
# EXPLAIN(OPT) SELECT * FROM parent p, child c WHERE p_id = c_p_id AND | ||
# p.crdb_region = c.crdb_region LIMIT 1 | ||
query T | ||
EXPLAIN(OPT) SELECT * FROM parent p, child c WHERE c_id = 10 AND p_id = c_p_id | ||
---- | ||
project | ||
└── inner-join (lookup parent [as=p]) | ||
├── lookup columns are key | ||
├── locality-optimized-search | ||
│ ├── scan child [as=c] | ||
│ │ └── constraint: /13/11: [/'ap-southeast-2'/10 - /'ap-southeast-2'/10] | ||
│ └── scan child [as=c] | ||
│ └── constraint: /18/16 | ||
│ ├── [/'ca-central-1'/10 - /'ca-central-1'/10] | ||
│ └── [/'us-east-1'/10 - /'us-east-1'/10] | ||
└── filters (true) | ||
|
||
|
||
# Locality optimized lookup join is allowed in phase 1, though it is | ||
# not guaranteed to run with low latency. | ||
query T | ||
EXPLAIN(OPT) SELECT * FROM child WHERE NOT EXISTS (SELECT * FROM parent WHERE p_id = c_p_id) AND c_id = 10 | ||
---- | ||
anti-join (lookup parent) | ||
├── lookup columns are key | ||
├── anti-join (lookup parent) | ||
│ ├── lookup columns are key | ||
│ ├── locality-optimized-search | ||
│ │ ├── scan child | ||
│ │ │ └── constraint: /13/11: [/'ap-southeast-2'/10 - /'ap-southeast-2'/10] | ||
│ │ └── scan child | ||
│ │ └── constraint: /18/16 | ||
│ │ ├── [/'ca-central-1'/10 - /'ca-central-1'/10] | ||
│ │ └── [/'us-east-1'/10 - /'us-east-1'/10] | ||
│ └── filters (true) | ||
└── filters (true) | ||
|
||
|
||
statement ok | ||
SET locality_optimized_partitioned_index_scan = false | ||
|
||
# This query should error out because it is not locality optimized. | ||
statement error pq: Query has no home region. Try removing the join expression. | ||
EXPLAIN SELECT * FROM child WHERE NOT EXISTS (SELECT * FROM parent WHERE p_id = c_p_id) AND c_id = 10 | ||
|
||
statement ok | ||
RESET locality_optimized_partitioned_index_scan | ||
|
||
# Locality optimized search is allowed. | ||
query T | ||
EXPLAIN(OPT) SELECT * FROM parent LIMIT 1 | ||
---- | ||
locality-optimized-search | ||
├── scan parent | ||
│ ├── constraint: /7/6: [/'ap-southeast-2' - /'ap-southeast-2'] | ||
│ └── limit: 1 | ||
└── scan parent | ||
├── constraint: /11/10: [/'ca-central-1' - /'us-east-1'] | ||
└── limit: 1 | ||
|
||
# Locality optimized search with lookup join will be supported in phase 2 or 3 | ||
# when we can dynamically determine if the lookup will access a remote region. | ||
statement error pq: Query has no home region. Try removing the join expression. | ||
EXPLAIN SELECT * FROM customers c JOIN orders o ON c.id = o.cust_id AND | ||
(c.crdb_region = o.crdb_region) WHERE c.id = '69a1c2c2-5b18-459e-94d2-079dc53a4dd0' | ||
|
||
# Select from a global table is OK with ZONE survivability. | ||
query T | ||
EXPLAIN SELECT * FROM messages_global@messages_global_pkey | ||
---- | ||
distribution: local | ||
vectorized: true | ||
· | ||
• scan | ||
missing stats | ||
table: messages_global@messages_global_pkey | ||
spans: FULL SCAN | ||
|
||
# Select from REGIONAL BY TABLE is OK with ZONE survivability. | ||
query T | ||
EXPLAIN SELECT message from messages_rbt@messages_rbt_pkey | ||
---- | ||
distribution: local | ||
vectorized: true | ||
· | ||
• scan | ||
missing stats | ||
table: messages_rbt@messages_rbt_pkey | ||
spans: FULL SCAN | ||
|
||
statement ok | ||
ALTER TABLE messages_rbt SET LOCALITY REGIONAL BY TABLE IN "us-east-1"; | ||
|
||
# Select from REGIONAL BY TABLE should indicate the gateway region to use. | ||
statement error pq: Query is not running in its home region. Try running the query from region 'ap-southeast-2'. | ||
EXPLAIN SELECT message from messages_rbt@messages_rbt_pkey WHERE crdb_region = 'ap-southeast-2' | ||
|
||
# Logging in through the appropriate gateway region allows reading from an RBR | ||
# table with a span on crdb_region. | ||
query T nodeidx=4 | ||
SET enforce_home_region = true; | ||
USE multi_region_test_db; | ||
EXPLAIN(OPT) SELECT message from messages_rbr@msg_idx WHERE crdb_region = 'ca-central-1' | ||
---- | ||
project | ||
└── scan messages_rbr@msg_idx | ||
├── constraint: /4/3/1: [/'ca-central-1' - /'ca-central-1'] | ||
└── flags: force-index=msg_idx | ||
|
||
# Equality predicate on crdb_region of an RBR table is allowed. | ||
query T | ||
EXPLAIN(OPT) SELECT message from messages_rbr@msg_idx WHERE crdb_region = 'ap-southeast-2' | ||
---- | ||
project | ||
└── scan messages_rbr@msg_idx | ||
├── constraint: /4/3/1: [/'ap-southeast-2' - /'ap-southeast-2'] | ||
└── flags: force-index=msg_idx | ||
|
||
statement ok | ||
PREPARE s AS EXPLAIN SELECT message from messages_rbr@msg_idx WHERE crdb_region = $1 | ||
|
||
# Prepared statement accessing the local span is allowed. | ||
query T | ||
EXECUTE s('ap-southeast-2') | ||
---- | ||
distribution: local | ||
vectorized: true | ||
· | ||
• scan | ||
missing stats | ||
table: messages_rbr@msg_idx | ||
spans: [/'ap-southeast-2' - /'ap-southeast-2'] | ||
|
||
# Prepared statement accessing a remote span is disallowed. | ||
statement error pq: Query is not running in its home region. Try running the query from region 'ap-southeast-2'. | ||
EXECUTE s('us-east-1') | ||
|
||
statement ok | ||
RESET enforce_home_region | ||
|
||
statement ok | ||
CREATE DATABASE non_multiregion_test_db; | ||
|
||
statement ok | ||
USE non_multiregion_test_db | ||
|
||
statement ok | ||
CREATE TABLE messages ( | ||
account_id INT NOT NULL, | ||
message_id UUID DEFAULT gen_random_uuid(), | ||
message STRING NOT NULL, | ||
PRIMARY KEY (account_id), | ||
INDEX msg_idx(message) | ||
) | ||
|
||
statement ok | ||
SET enforce_home_region = true | ||
|
||
# Tables in non-multiregion databases have no home region. | ||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN SELECT * FROM messages | ||
|
||
# If any table in a query has no home region, error out. | ||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN SELECT * FROM non_multiregion_test_db.messages, multi_region_test_db.messages_global | ||
|
||
# Scans with contradictions in predicates are allowed. | ||
query T | ||
EXPLAIN SELECT * FROM messages WHERE account_id = 1 AND account_id = 2 | ||
---- | ||
distribution: local | ||
vectorized: true | ||
· | ||
• norows | ||
|
||
# A lookup join from a multiregion table to non-multiregion table is not | ||
# allowed. | ||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN SELECT * FROM multi_region_test_db.messages_global mr INNER LOOKUP JOIN non_multiregion_test_db.messages nmr | ||
ON mr.account_id = nmr.account_id | ||
|
||
statement ok | ||
ALTER DATABASE multi_region_test_db SURVIVE REGION FAILURE | ||
|
||
statement ok | ||
USE multi_region_test_db | ||
|
||
# Statements which previously succeeded should now fail under REGION survivability. | ||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN(OPT) SELECT * FROM parent p, child c WHERE c_id = 10 AND p_id = c_p_id | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN(OPT) SELECT * FROM child WHERE NOT EXISTS (SELECT * FROM parent WHERE p_id = c_p_id) AND c_id = 10 | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN(OPT) SELECT * FROM parent LIMIT 1 | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN SELECT * FROM messages_global@messages_global_pkey | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN SELECT message from messages_rbt@messages_rbt_pkey | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXPLAIN(OPT) SELECT message from messages_rbr@msg_idx WHERE crdb_region = 'ap-southeast-2' | ||
|
||
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability. | ||
EXECUTE s('ap-southeast-2') | ||
|
||
statement ok | ||
RESET enforce_home_region |
7 changes: 7 additions & 0 deletions
7
pkg/ccl/logictestccl/tests/multiregion-9node-3region-3azs/generated_test.go
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.