You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Case 1: flattening (via row filter) + yb_hash_code filter
createtablet (i int, j int, k int, l int);
createindexion t (l, k, j);
/*+IndexScan(t i)*/select*from t where yb_hash_code(l) >5and row(k, j) > row(1, 2);
2023-07-19 17:06:34.903 PDT [508432] WARNING: server process (PID 508504) was terminated by signal 11: Segmentation fault
2023-07-19 17:06:34.903 PDT [508432] DETAIL: Failed process was running: /*+IndexScan(t i)*/ select * from t where yb_hash_code(l) > 5 and row(k, j) > row(1, 2);
Created the repro because this logic looked suspicious:
for (inti=0; i<nkeys; ++i)
{
ScanKeykey=&keys[i];
/* * Keys for hash code search should be placed after regular keys. * For this purpose they are written into keys array from * right to left. * We also flatten out row keys */ybScan->keys[YbIsHashCodeSearch(key)
? (nkeys- (++ybScan->nhash_keys))
: ybScan->nkeys++] =key;
if (YbIsRowHeader(&keys[i]))
{
ScanKeycurrent= (ScanKey) keys[i].sk_argument;
do
{
ybScan->keys[ybScan->nkeys++] =current;
}
while (((current++)->sk_flags&SK_ROW_END) ==0);
}
}
It looks like it assumes nkeys of the original keys is the total size of ybScan->keys, but due to flattening of header keys, it could end up putting more than nkeys keys in ybScan->keys. The logic writes forward and backward, so this likely results in an overlap in case both hash code search and row filter is used.
Case 2: flattening overflows buffer
The following check also is not proper because it doesn't account for flattening, so it can lead to buffer overflow:
if (nkeys>YB_MAX_SCAN_KEYS)
Repro:
bin/ysqlsh -c "create table t (i int); create index on t (i asc); select i from t where row($(for _ in {1..100};doecho -n i,; done)i) > row($(for _ in {1..100};doecho -n 0,; done)0);"
2023-07-19 17:35:14.079 PDT [510322] WARNING: server process (PID 510587) was terminated by signal 11: Segmentation fault
2023-07-19 17:35:14.079 PDT [510322] DETAIL: Failed process was running: create table t (i int); create index on t (i asc); select i from t where row(i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i) > row(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);
Why was flattening done in the first place? It seems to make the logic more complicated, so consider going back to preserving the original format.
(This was recent master commit 94ba9c7, alma8 fastdebug gcc11.)
Warning: Please confirm that this issue does not contain any sensitive information
I confirm this issue does not contain any sensitive information.
The text was updated successfully, but these errors were encountered:
jasonyb
changed the title
[YSQL] Segfault on index scan with yb_hash_code and row filter
[YSQL] Segfault on index scan with scankey flattening
Jul 20, 2023
Summary:
Commit 12e23ad, titled
[#14068] YSQL: Allow Batched Nested Loop joins for conjunction clauses
introduces bugs for lsm IndexScan/IndexOnlyScan when, during
execution, scan keys are transferred to YbScanDesc. The main reason is
that it flattens subkeys but doesn't modify bounds checks accordingly.
This can lead to segfaults or wrong results.
1. yb_hash_code and regular keys: both types of keys share the same
array, one writing forward in the array and the other writing
backwards, with a calculation to make sure they don't overlap.
Flattening causes more keys to be written than anticipated, so
regular keys go beyond the calculated midpoint into the same area
that yb_hash_code keys are written (or even further beyond that).
Fix by separating yb_hash_code keys into a separate List to avoid
having to deal with calculations and sharing space for two separate
collections of keys.
1. Overall array bounds check: YbScanDesc->keys is a fixed length array,
and the logic to prevent writing outside the array boundaries doesn't
account for flattening, resulting in potential buffer overflow. Fix
the check.
Add tests to exercise these cases and more. Uncover more bugs #18347
and #18360 in the process.
Jira: DB-7311
Test Plan:
./yb_build.sh fastdebug --gcc11 --java-test TestPgRegressDml
Backport-through: 2.18
Original commit: bfe385d / D27169
Reviewers: tnayak
Reviewed By: tnayak
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D27415
Jira Link: DB-7311
Description
Case 1: flattening (via row filter) + yb_hash_code filter
Created the repro because this logic looked suspicious:
It looks like it assumes
nkeys
of the originalkeys
is the total size ofybScan->keys
, but due to flattening of header keys, it could end up putting more thannkeys
keys inybScan->keys
. The logic writes forward and backward, so this likely results in an overlap in case both hash code search and row filter is used.Case 2: flattening overflows buffer
The following check also is not proper because it doesn't account for flattening, so it can lead to buffer overflow:
Repro:
bin/ysqlsh -c "create table t (i int); create index on t (i asc); select i from t where row($(for _ in {1..100}; do echo -n i,; done)i) > row($(for _ in {1..100}; do echo -n 0,; done)0);"
Why was flattening done in the first place? It seems to make the logic more complicated, so consider going back to preserving the original format.
(This was recent master commit 94ba9c7, alma8 fastdebug gcc11.)
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: