Ysql: cannot count, list, delete many rows #4692

Closed
svalaskevicius opened this issue Jun 5, 2020 · 14 comments
Labels
area/ysql Yugabyte SQL (YSQL) · community/request Issues created by external users · kind/bug This issue is a bug · priority/medium Medium priority issue

Comments


svalaskevicius commented Jun 5, 2020

Jira Link: DB-1342

yugabyte=# select count(*) from <table>;

 -- waits for up to a minute, tserver is using cpu

ERROR:  Timed out: Read RPC (request call id 209) to <server ip>:9100 timed out after 0.557s
  • the table should have >1M rows, but obviously I cannot tell the exact number
  • YB version: 2.1.6.0-b17 (from Docker Hub)
  • there are 3 nodes in the cluster
@svalaskevicius (Author)

Not sure if related, but similar errors happen with delete from <table> where <partial primary index check>

i.e. the primary key is a composite of two fields, and only the first field is provided for the delete.
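
To make that concrete, a minimal sketch of the statement shape being described (placeholder names matching the anonymized schema posted further down, where p1 is the hash part of the composite primary key and p2 is the second, unconstrained part):

    -- hypothetical example of the partial-primary-key delete described above
    delete from <table> where p1 = 'some-value';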


svalaskevicius commented Jun 5, 2020

In effect, YB can act as a good key-value store if you happen to know the exact IDs, but the rest of the functionality is quite limited.

yugabyte=# select primary_key_part1, primary_key_part2 from <table>;
-- time passed
^CCancel request sent
^[[A^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
ERROR:  Timed out: Read RPC (request call id 489) to <server ip>:9100 timed out after 0.651s

@yugabyte-ci yugabyte-ci added the community/request Issues created by external users label Jun 5, 2020
@kmuthukk kmuthukk added the area/ysql Yugabyte SQL (YSQL) label Jun 5, 2020

svalaskevicius commented Jun 5, 2020

table info:

yugabyte=# \d <table>
                        Table "public.<table>"
 Column  |            Type             | Collation | Nullable | Default
---------+-----------------------------+-----------+----------+---------
 p1      | character varying(255)      |           | not null |
 p2      | character varying(255)      |           | not null |
 version | bigint                      |           | not null |
 data    | jsonb                       |           | not null |
 ts      | timestamp without time zone |           | not null |
Indexes:
    "<table>_pkey" PRIMARY KEY, lsm (p1 HASH, p2)
    "<table>_ts_index" lsm (ts HASH)


svalaskevicius commented Jun 6, 2020

Thinking about the issue: between counting, deleting and listing items, I'd choose listing first, as the former two could be done manually given the list. But of course, the basic expectation of an SQL DB is to have all of these working.

svalaskevicius changed the title from "cannot select count(*)" to "Ysql: cannot count, list, delete all items" Jun 6, 2020
svalaskevicius changed the title from "Ysql: cannot count, list, delete all items" to "Ysql: cannot count, list, delete many rows" Jun 6, 2020

kmuthukk commented Jun 6, 2020

cc: @psudheer21, @m-iancu


ddorian commented Jun 8, 2020

@svalaskevicius

Did you run a big insert/update/delete transaction before the timeouts started appearing?
If yes, can you restart the server and see if the issue persists? (small diagnostic)

Regarding:

delete from <table> where <partial primary index check>

Can you specify the full WHERE clause? Was it a range query involving both primary key columns? And roughly how many rows would this delete?

Can you also paste logs from the yb-tserver .INFO and .WARNING files while the query is running? (how to find logs)

@svalaskevicius (Author)

@ddorian

the delete statement:
delete from entities where bucket = 'keyXXX'

bucket is the first part of the primary key.

There should be ~76k rows in this bucket, according to another DB.

The smaller buckets are working OK, e.g. the one with 17k rows was processed fine (however, 76k is not the largest bucket).

==> yb-tserver.WARNING <==
W0608 12:50:42.513047 10165 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 202) took 2502ms (client timeout 2500ms).

==> yb-tserver.INFO <==
W0608 12:50:42.513047 10165 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 202) took 2502ms (client timeout 2500ms).

==> yb-tserver.WARNING <==
W0608 12:50:45.047436 10116 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 206) took 2502ms (client timeout 2500ms).

==> yb-tserver.INFO <==
W0608 12:50:45.047436 10116 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 206) took 2502ms (client timeout 2500ms).



< many lines of the same >


==> yb-tserver.INFO <==
W0608 12:51:38.978379  9783 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 286) took 2502ms (client timeout 2500ms).

==> yb-tserver.WARNING <==
W0608 12:51:40.013182  9795 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 290) took 673ms (client timeout 652ms).

==> yb-tserver.INFO <==
W0608 12:51:40.013182  9795 yb_rpc.cc:426] Call yb.tserver.TabletServerService.Read <ip>:53810 => <ip>:9100 (request call id 290) took 673ms (client timeout 652ms).

==> postgresql-2020-06-08_000000.log <==
W0608 12:51:40.002357 15158 tablet_rpc.cc:368] Timed out (yb/rpc/outbound_call.cc:512): Failed Read(tablet: 3f4c570253c74f8ab9eac21d5ed90f77, num_ops: 1, num_attempts: 89, txn: 00000000-0000-0000-0000-000000000000) to tablet 3f4c570253c74f8ab9eac21d5ed90f77 on tablet server { uuid: 74460c6721334c0fae97dbae8c448df2 private: [host: "<ip>" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" after 89 attempt(s): Read RPC (request call id 290) to <ip>:9100 timed out after 0.652s
2020-06-08 12:51:40.007 UTC [15148] ERROR:  Timed out: Read RPC (request call id 290) to <ip>:9100 timed out after 0.652s
2020-06-08 12:51:40.007 UTC [15148] STATEMENT:  delete from entities where bucket = 'keyXXX';

@svalaskevicius (Author)

Also, there were no other big transactions running in parallel or before this statement.

@svalaskevicius (Author)

A quick question - when I select all data in the table, does it buffer the whole response somewhere, or does it stream the rows to the client (either one by one or batched)?


ndeodhar commented Jun 8, 2020

Hi @svalaskevicius, can you please change the retryable_rpc_single_call_timeout_ms gflag to 60000? This will help eliminate the 2.5 second timeouts that you see. Note that this will need a restart of your tservers.

About your question, no, we don't stream rows to the client currently. Have you tried running
select * from <table> limit 10? Does that work okay? (I'm trying to understand whether count(*) is the only bottleneck or whether there's something else that's going on).


svalaskevicius commented Jun 9, 2020

@ndeodhar a simple limit 10 works; however, select * from entities limit 10 offset 500000; breaks with the same error (Read RPC times out). So again, it does not seem possible to get the whole data back.

I can raise the timeout value (will try at some point later); however, it seems that will only move the limit, or worse, fail because of not enough RAM. (Btw, the servers the current DB runs on are not big, with some other processes consuming memory too, so I wouldn't be surprised if the memory limit has been reached even now - i.e. it depends on how much additional RAM is needed for the response/search buffer. Regardless of the server, though, this suggests that at some point, with a different amount of data, any server would hit this issue.)

Is streaming rows planned to be implemented, or is not streaming them a design decision? Although the limit 10 query shows that this is not purely down to non-streaming, but also to filtering / offsetting.

Is there an alternative approach to export all ids in the table?


ndeodhar commented Jun 9, 2020

@svalaskevicius A query like select count(*) won't run out of memory because it's scan intensive and only maintains a counter in memory. Similarly, select * from entities limit 10 offset 500000 will also be more scan intensive. But yes, a huge select or huge join could fail because of not enough RAM. We want to get to a better streaming solution in the future, but that's going to be a few months down the line.

In the interim, we do have plans to make scans more efficient, for example:
#3520

There are 2 alternatives for getting the data/counts (not great alternatives, but available in case you need them):

  1. You can use ysql_dump to dump the table contents.
  2. For the number of rows, you can use the pg_stat_statements extension (using create extension pg_stat_statements) and run the following query on every node (see the sketch below):
    select sum(rows) from pg_stat_statements where query ilike 'insert into entities%'
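
A minimal sketch of alternative 2 (assuming the entities table used above; pg_stat_statements keeps per-node statistics, so the per-node sums need to be added up manually, and the result only reflects inserts recorded since statistics collection started, not later deletes):

    -- run on every node:
    create extension if not exists pg_stat_statements;
    -- approximate row count from the INSERT statements recorded on this node:
    select sum(rows) from pg_stat_statements where query ilike 'insert into entities%';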

@svalaskevicius (Author)

Thanks @ndeodhar, glad to see the ongoing scan improvements and the planned row streaming.

I'll try the increased timeout or the ysql_dump approach in the meantime.

andrei-mart added a commit that referenced this issue Feb 10, 2022
Summary:
Previously ysql_scan_timeout_multiplier was used to stop a scan before the
timeout; its default value of 0.5 meant the scan should return a response
after running for 1/2 of the current client timeout. The feature was broken
at some point, and the multiplier effectively became 1.0, so scans were
wrapped up too late.

This defunct gflag is deprecated and replaced with
ysql_scan_deadline_margin_ms, which is an absolute value; its default value
of 1000 means the scan should be wrapped up after the client timeout minus
one second.

For maximum efficiency the margin should be just above the duration of a
network round trip, and one second is good for most setups. It can be
increased if the ping between nodes is longer than 500ms.

Minor fix: check the deadline every iteration instead of every 1024th; the
check is cheap and iterations are potentially lengthy, so doing 1024 of
them may take longer than the margin.

Test Plan:
ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Reviewers: mihnea, sergei, jason

Reviewed By: jason

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15359
andrei-mart added a commit that referenced this issue Feb 15, 2022
Summary:
Gives a good chance to reach the desired timing in 2 iterations.
Amendment on top of the already-landed https://phabricator.dev.yugabyte.com/D15359

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Reviewers: jason

Reviewed By: jason

Subscribers: amitanand, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15424
andrei-mart added a commit that referenced this issue Feb 17, 2022
Summary: same change as the D15359 / D15424 commits above.

Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Differential Revision: https://phabricator.dev.yugabyte.com/D15511
andrei-mart added a commit that referenced this issue Feb 17, 2022
Summary: same change as the D15359 / D15424 commits above.

Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Differential Revision: https://phabricator.dev.yugabyte.com/D15512
andrei-mart added a commit that referenced this issue Feb 18, 2022
Summary: same change as the D15359 / D15424 commits above.

Backport conflict: the target branch does not have ANALYZE support; changes to the respective code are skipped.

Original commits:
https://phabricator.dev.yugabyte.com/D15359 / [[https://github.com/yugabyte/yugabyte-db/commit/16bbda3ef0ad37c321be313be09661a7df5159dc|16bbda3]]
https://phabricator.dev.yugabyte.com/D15424 / [[https://github.com/yugabyte/yugabyte-db/commit/77f69a396fd39ce19d1baf6e97ee74d4802193e5|77f69a3]]

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Differential Revision: https://phabricator.dev.yugabyte.com/D15546
jayant07-yb pushed a commit to jayant07-yb/yugabyte-db that referenced this issue Mar 8, 2022
Summary: same change as the D15359 commit above.

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Differential Revision: https://phabricator.dev.yugabyte.com/D15359
jayant07-yb pushed a commit to jayant07-yb/yugabyte-db that referenced this issue Mar 8, 2022
Summary: same amendment as the D15424 commit above.

Test Plan: ybd --java-test 'org.yb.pgsql.TestPgReadTimeout'

Differential Revision: https://phabricator.dev.yugabyte.com/D15424
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 8, 2022
@sushantrmishra

There have been many improvements in the scan path, and timeout-related fixes have been added. Since there is no reproducible case listed in the issue, we are unable to verify it, hence closing the issue.

Please reopen if this error is reported again.
