-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] Batching queries that use index lookup of elements inside an ANY array which uses only hash key component #7836
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
ramsrivatsa
added
the
kind/enhancement
This is an enhancement of an existing feature
label
Mar 28, 2021
ramsrivatsa
changed the title
[YSQL] Batching queries that use index lookup of elements inside an ANY array
[YSQL] Batching queries that use index lookup of elements inside an ANY array which uses only hash key component
Mar 29, 2021
8 tasks
tanujnay112
added a commit
that referenced
this issue
Mar 17, 2023
Summary: Before this change, IN conditions bound to hash key columns produce one request per possible values of the hash keys. For example, consider a query `SELECT * FROM sample_table WHERE h1 IN (1,4,6,8);` where sample_table has a primary index with `h1` as its full hash component. We send 4 requests, one per each in element, of the form `SELECT * FROM sample_table WHERE h1 = 1;`, `SELECT * FROM sample_table WHERE h1 = 4;` etc. If the IN condition was bound to a range column, we would send the entire filter at once as a singular condition and send just one request per partition of `sample_table_pkey`. The reason why we couldn't do this with hash column IN filters was because the `DocRowwiseIterator` could not perform skip scans over hash columns so it did not have the necessary infrastructure to process IN conditions on hash columns. This diff fixes the above issue by having IN conditions on hash columns behave similar to those on range columns. In order to do this, we did the following changes: - We adjusted pgsql/qlscanspec and ScanChoices to be able to carry out skip scans on hash column IN conditions. - We added infrastructure in pg_doc_op.h to convert IN filters of the form `h1 IN (v1,v2,...,vn)` a condition expression of the form `(yb_hash_code(h1), h1) IN ((yb_hash_code(v1), v1), (yb_hash_code(v2), v2), (yb_hash_code(v3), v3), ..., (yb_hash_code(vn), vn))`. If we have multiple hash partitions on the table we form one request per partition and the RHS of the hash condition on each partition request is ensured to only have values from (v1,v2,...vn) that are relevant to it. This feature also works similarly for multicolumn hash keys. This feature is disabled when serializable isolation level is used for now as there isn't infrastructure to lock multiple non-contiguous rows as such filters would require. This feature's enablement is controlled by the autoflag GUC `yb_enable_hash_batch_in`. We also added a tserver flag `ysql_hash_batch_permutation_limit` that specifies a limit on the number of hash permutations a query must produce in order to be eligible to use this feature. Without this check, we can materialize an unbounded number of hash permutations in memory and cause an OOM crash. Test Plan: ``` ./yb_build.sh release --java-test org.yb.pgsql.TestPgRegressIndex ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressHashInQueries' ``` Reviewers: smishra, neil, amartsinchyk, kpopali Reviewed By: kpopali Subscribers: mbautin, kpopali, kannan, ssong, yql, mihnea, bogdan Differential Revision: https://phabricator.dev.yugabyte.com/D19672
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-3200
If the following set of queries are executed 5 docDB requests are being sent. These requests can be batched into 1 docDB request.
Note: docDB requests can be viewed in this file
~/yugabyte-data/node-1/disk-1/yb-data/tserver/logs/postgresql*.log
The text was updated successfully, but these errors were encountered: