[YSQL] Segfault on index scan with scankey flattening #18322

jasonyb · 2023-07-20T00:18:08Z

Jira Link: DB-7311

Description

Case 1: flattening (via row filter) + yb_hash_code filter

create table t (i int, j int, k int, l int);
create index i on t (l, k, j);
/*+IndexScan(t i)*/ select * from t where yb_hash_code(l) > 5 and row(k, j) > row(1, 2);

2023-07-19 17:06:34.903 PDT [508432] WARNING:  server process (PID 508504) was terminated by signal 11: Segmentation fault
2023-07-19 17:06:34.903 PDT [508432] DETAIL:  Failed process was running: /*+IndexScan(t i)*/ select * from t where yb_hash_code(l) > 5 and row(k, j) > row(1, 2);

Created the repro because this logic looked suspicious:

    for (int i = 0; i < nkeys; ++i)
    {
        ScanKey key = &keys[i];
        /*
         * Keys for hash code search should be placed after regular keys.
         * For this purpose they are written into keys array from
         * right to left.
         * We also flatten out row keys
         */
        ybScan->keys[YbIsHashCodeSearch(key)
            ? (nkeys - (++ybScan->nhash_keys))
            : ybScan->nkeys++] = key;

        if (YbIsRowHeader(&keys[i]))
        {
            ScanKey current = (ScanKey) keys[i].sk_argument;
            do
            {
                ybScan->keys[ybScan->nkeys++] = current;
            }
            while (((current++)->sk_flags & SK_ROW_END) == 0);
        }
    }

It looks like it assumes nkeys of the original keys is the total size of ybScan->keys, but due to flattening of header keys, it could end up putting more than nkeys keys in ybScan->keys. The logic writes forward and backward, so this likely results in an overlap in case both hash code search and row filter is used.

Case 2: flattening overflows buffer

The following check also is not proper because it doesn't account for flattening, so it can lead to buffer overflow:

if (nkeys > YB_MAX_SCAN_KEYS)

Repro:

bin/ysqlsh -c "create table t (i int); create index on t (i asc); select i from t where row($(for _ in {1..100}; do echo -n i,; done)i) > row($(for _ in {1..100}; do echo -n 0,; done)0);"

2023-07-19 17:35:14.079 PDT [510322] WARNING:  server process (PID 510587) was terminated by signal 11: Segmentation fault
2023-07-19 17:35:14.079 PDT [510322] DETAIL:  Failed process was running: create table t (i int); create index on t (i asc); select i from t where row(i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i,i) > row(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

Why was flattening done in the first place? It seems to make the logic more complicated, so consider going back to preserving the original format.

(This was recent master commit 94ba9c7, alma8 fastdebug gcc11.)

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

The text was updated successfully, but these errors were encountered:

Summary: Commit 12e23ad, titled [#14068] YSQL: Allow Batched Nested Loop joins for conjunction clauses introduces bugs for lsm IndexScan/IndexOnlyScan when, during execution, scan keys are transferred to YbScanDesc. The main reason is that it flattens subkeys but doesn't modify bounds checks accordingly. This can lead to segfaults or wrong results. 1. yb_hash_code and regular keys: both types of keys share the same array, one writing forward in the array and the other writing backwards, with a calculation to make sure they don't overlap. Flattening causes more keys to be written than anticipated, so regular keys go beyond the calculated midpoint into the same area that yb_hash_code keys are written (or even further beyond that). Fix by separating yb_hash_code keys into a separate List to avoid having to deal with calculations and sharing space for two separate collections of keys. 1. Overall array bounds check: YbScanDesc->keys is a fixed length array, and the logic to prevent writing outside the array boundaries doesn't account for flattening, resulting in potential buffer overflow. Fix the check. Add tests to exercise these cases and more. Uncover more bugs #18347 and #18360 in the process. Jira: DB-7311 Test Plan: ./yb_build.sh fastdebug --gcc11 --java-test TestPgRegressDml Backport-through: 2.18 Original commit: bfe385d / D27169 Reviewers: tnayak Reviewed By: tnayak Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D27415

jasonyb added kind/bug This issue is a bug area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jul 20, 2023

jasonyb self-assigned this Jul 20, 2023

yugabyte-ci added the priority/medium Medium priority issue label Jul 20, 2023

jasonyb changed the title ~~[YSQL] Segfault on index scan with yb_hash_code and row filter~~ [YSQL] Segfault on index scan with scankey flattening Jul 20, 2023

yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jul 22, 2023

jasonyb closed this as completed in bfe385d Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] Segfault on index scan with scankey flattening #18322

[YSQL] Segfault on index scan with scankey flattening #18322

jasonyb commented Jul 20, 2023 •

edited

Loading

[YSQL] Segfault on index scan with scankey flattening #18322

[YSQL] Segfault on index scan with scankey flattening #18322

Comments

jasonyb commented Jul 20, 2023 • edited Loading

Description

Warning: Please confirm that this issue does not contain any sensitive information

jasonyb commented Jul 20, 2023 •

edited

Loading