Ysql: cannot count, list, delete many rows #4692
Comments
Not sure if related, but similar errors happen when, e.g., the primary index is a composite of two fields and only the first field is provided for the delete.
In effect, YB can act as a good key-value store if you happen to know the exact ids, but the rest of the functionality is quite limited.
table info:
Thinking about the issue: between counting, deleting, and listing items, I'd choose listing first, as the former two could be done manually given the list. Of course, the basic expectation of an SQL db is to have all of these working.
cc: @psudheer21, @m-iancu
Did you run a big transaction insert/update/delete query before the timeouts started appearing? Regarding:
Can you specify the full Can you also paste logs from yb-tserver .INFO and .WARNING while the query is running? (how to find logs)
the delete statement:
There should be ~76k items in this bucket, according to another DB. The smaller buckets are working OK, e.g. the one with 17k items was processed well (however, 76k is not the highest limit).
Also, no other big transactions were running in parallel with or before this statement.
A quick question - when I select all data in the table, does it buffer the whole response somewhere, or does it stream the rows to the client (either one by one or batched)?
Hi @svalaskevicius, can you please change About your question, no, we don't stream rows to the client currently. Have you tried running
@ndeodhar simple limit 10 works, however I can raise the timeout value (will try at some point later), but it seems that this will only move the limit or, worse, fail because of insufficient RAM. (By the way, the servers the current DB runs on are not big, and other processes consume memory too, so I wouldn't be surprised if the memory limit has been reached even now, depending on how much additional RAM the response/search buffer needs. Regardless of the server, this indicates that at some amount of data any server would hit this issue.) Is streaming rows planned, or is not streaming them a design decision? That said, the limit 10 query shows that this is not purely due to non-streaming, but also to filtering/offsetting. Is there an alternative approach to export all ids in the table?
@svalaskevicius A query like In the interim, we do have plans to make scans more efficient, for example: There are 2 alternatives to getting the data/counts (not great alternatives, but available in case you need them):
Thanks @ndeodhar, glad to see the ongoing scan improvements and the planned row streaming. I'll try the increased timeout or the ysql_dump approach in the meantime.
Summary: Previously, ysql_scan_timeout_multiplier was used to stop a scan before the timeout; its default value of 0.5 meant a scan should return a response after running for half of the current client timeout. The feature broke at some point and the multiplier effectively became 1.0, so scans were wrapped up too late. This defunct gflag is deprecated and replaced with ysql_scan_deadline_margin_ms, which is an absolute value; its default value of 1000 means the scan should be wrapped up at the client timeout minus one second. For maximum efficiency the margin should be just above the duration of a network round trip, and one second is good for most setups. It can be increased if the ping between nodes is longer than 500ms.
Minor fix: check the deadline every iteration instead of every 1024th. The check is cheap and iterations are potentially lengthy, so doing 1024 of them may take longer than the margin.
Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'
Reviewers: mihnea, sergei, jason
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15359
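
The deadline arithmetic described in this commit message can be illustrated with a small sketch. This is not the YugabyteDB implementation (which lives in the tserver's C++ scan path); the function names, the injectable clock, and the row iterable below are all hypothetical and only model the behaviour the summary describes: a relative multiplier in the old scheme, an absolute margin in the new one, and a deadline check on every iteration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def old_scan_deadline(start_s: float, client_timeout_ms: int,
                      multiplier: float = 0.5) -> float:
    """Old scheme (hypothetical model): stop after a fraction of the client timeout.

    With the multiplier effectively stuck at 1.0, the deadline equals the
    client timeout itself, leaving no time to send a response.
    """
    return start_s + client_timeout_ms * multiplier / 1000.0


def scan_deadline(start_s: float, client_timeout_ms: int,
                  margin_ms: int = 1000) -> float:
    """New scheme (hypothetical model): stop margin_ms before the client timeout."""
    return start_s + max(client_timeout_ms - margin_ms, 0) / 1000.0


def scan(rows: Iterable[int], deadline_s: float,
         now: Callable[[], float] = time.monotonic) -> Tuple[List[int], bool]:
    """Collect rows until the input is exhausted or the deadline is reached.

    The deadline is checked on every iteration, not every 1024th, because a
    single iteration may be slow. Returns (rows_collected, finished).
    """
    collected: List[int] = []
    for row in rows:
        if now() >= deadline_s:
            return collected, False  # wrap up early with a partial result
        collected.append(row)
    return collected, True
```

For a 2500 ms client timeout starting at time 0, the old default gives a deadline of 1.25 s, the broken multiplier of 1.0 gives 2.5 s (the timeout itself), and the new default margin gives 1.5 s.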
Summary: There is a good chance of reaching the desired timing in 2 iterations. Amendment on top of the already landed https://phabricator.dev.yugabyte.com/D15359
Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'
Reviewers: jason
Reviewed By: jason
Subscribers: amitanand, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15424
Summary: Previously, ysql_scan_timeout_multiplier was used to stop a scan before the timeout; its default value of 0.5 meant a scan should return a response after running for half of the current client timeout. The feature broke at some point and the multiplier effectively became 1.0, so scans were wrapped up too late. This defunct gflag is deprecated and replaced with ysql_scan_deadline_margin_ms, which is an absolute value; its default value of 1000 means the scan should be wrapped up at the client timeout minus one second. For maximum efficiency the margin should be just above the duration of a network round trip, and one second is good for most setups. It can be increased if the ping between nodes is longer than 500ms.
Minor fix: check the deadline every iteration instead of every 1024th. The check is cheap and iterations are potentially lengthy, so doing 1024 of them may take longer than the margin.
Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]
Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'
Reviewers: mihnea, jason
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15511
Summary: Previously, ysql_scan_timeout_multiplier was used to stop a scan before the timeout; its default value of 0.5 meant a scan should return a response after running for half of the current client timeout. The feature broke at some point and the multiplier effectively became 1.0, so scans were wrapped up too late. This defunct gflag is deprecated and replaced with ysql_scan_deadline_margin_ms, which is an absolute value; its default value of 1000 means the scan should be wrapped up at the client timeout minus one second. For maximum efficiency the margin should be just above the duration of a network round trip, and one second is good for most setups. It can be increased if the ping between nodes is longer than 500ms.
Minor fix: check the deadline every iteration instead of every 1024th. The check is cheap and iterations are potentially lengthy, so doing 1024 of them may take longer than the margin.
Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]
Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'
Reviewers: mihnea, jason
Reviewed By: jason
Subscribers: jenkins-bot, bogdan, sanketh, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15512
Summary: Previously, ysql_scan_timeout_multiplier was used to stop a scan before the timeout; its default value of 0.5 meant a scan should return a response after running for half of the current client timeout. The feature broke at some point and the multiplier effectively became 1.0, so scans were wrapped up too late. This defunct gflag is deprecated and replaced with ysql_scan_deadline_margin_ms, which is an absolute value; its default value of 1000 means the scan should be wrapped up at the client timeout minus one second. For maximum efficiency the margin should be just above the duration of a network round trip, and one second is good for most setups. It can be increased if the ping between nodes is longer than 500ms.
Minor fix: check the deadline every iteration instead of every 1024th. The check is cheap and iterations are potentially lengthy, so doing 1024 of them may take longer than the margin.
Backport conflict: the target branch does not have ANALYZE support, so changes to the respective code are skipped.
Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]
Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'
Reviewers: mihnea, jason
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15546
There have been many improvements in the scan path, and timeout-related changes have been added. Since no reproducible case is listed in the issue, we are unable to verify, hence closing the issue. Please reopen if this error is reported again.
Jira Link: DB-1342