Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: consider limiting the memory usage of prepared statements in a single session #97866

Closed
yuzefovich opened this issue Mar 1, 2023 · 6 comments · Fixed by #98917
Closed
Assignees
Labels
branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Mar 1, 2023

Currently, all prepared statements of a single session are kept in memory until either the session stops (could be due to a node restart) or the prepared statements are dropped via DEALLOCATE command. If a user has long-lived sessions (e.g. due to the usage of the connection pooling) and creates new prepared statements without deallocating the previous ones, the memory usage of all of the prepared statements on the node can reach --max-sql-memory budget in which case the node can effectively become unavailable.

We should consider limiting the memory usage of all prepared statements at the session level, perhaps by implementing LRU cache. Note that the current behavior of never internally deallocating prepared statements is consistent with Postgres, so evicting the prepared statements once the threshold is reached can break the existing applications. However, it seems that the downside of that (an app hitting an error) might be smaller than the downside of keeping all prepared statements live and in-memory (node becoming unavailable).

Jira issue: CRDB-24930

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 1, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 1, 2023
@mgartner
Copy link
Collaborator

mgartner commented Mar 1, 2023

This seems very related to #72581. Possibly a duplicate?

@yuzefovich
Copy link
Member Author

Indeed, it is exactly one of the points of #72581

we need to limit the total memory to a fraction of memory available to SQL

However, the title of that issue seems to be about performing correct memory accounting for the prepared statements (which we already do I think).

@michae2
Copy link
Collaborator

michae2 commented Mar 3, 2023

If we had some hook into the connection pooling, we could perform a DEALLOCATE ALL ourselves on the pooled connection when the application connection closes. (But I don't know if we have such a hook.)

@DrewKimball
Copy link
Collaborator

Maybe should be its own issue, but we should consider trying to reuse entries in the prepared statement cache when the query is the same apart from its name. I think that would have prevented the recent issue entirely.

michae2 added a commit to michae2/cockroach that referenced this issue Mar 18, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 18, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 21, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 21, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
craig bot pushed a commit that referenced this issue Mar 22, 2023
98671: streamingccl: use fingerprinting in more tests r=dt a=stevendanna

This uses fingerprinting in more tests to avoid starting the destination tenant before cutting over.

Epic: none

Release note: None

98914: sqlsmith: add DELETE FROM ... USING and UPDATE ... FROM support r=yuzefovich a=yuzefovich

This commit makes it so that the sqlsmith can now generate statements of the form `DELETE FROM ... USING` and `UPDATE ... FROM`. We toss a coin every time before deciding to add another table (meaning in 50% cases these forms are not used, in 25% we have 1 extra table, etc). It also adjusts the generation of the RETURNING clause for DELETE and UPDATE to be able to pick from any of the table sources.

Fixes: #98910.

Release note: None

98917: sql: add prepared_statements_cache_size setting r=yuzefovich a=michae2

Add a new circular doubly-linked list of prepared statements to `prepStmtNamespace` which tracks the least-recently-used prepared statement. When new setting `prepared_statements_cache_size` is set, use this LRU list to automatically deallocate prepared statements.

Fixes: #97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size` setting which, when set to a non-zero number of bytes, causes the least-recently-used prepared statements to be automatically deallocated when prepared statement memory usage goes above the cache size. This setting can be used to avoid prepared statement leaks from long-lived connections which never `DEALLOCATE` prepared statements.

99155: sql/logictest: disable stats collection for system tables earlier r=rytaft a=rytaft

This commit updates the logictests to disable stats collection for system tables before the test cluster is started. This avoids a race condition where stats might be collected on system tables before collection is disabled with `SET CLUSTER SETTING`.

This should prevent flakes for tests that show `EXPLAIN` output for queries over system tables.

Fixes #99118

Release note: None

99173: sql: enable resumption of a flow for pausable portals r=yuzefovich a=ZhouXing19

This PR is part of the implementation of multiple active portals. (Extracted from #96358)

We now introduce a `Resume()` method for flow, and when a pausable portal is being re-executed, rather than generating a new flow, we resume the persisted flow to continue the previous execution.

---
### sql: add telemetry MultipleActivePortalCounter

This commit added a telemetry counter `MultipleActivePortalCounter` that would
be incremented each time the cluster setting
`sql.pgwire.multiple_active_portals.enabled` is set to true

---
### sql: add Resume method for flowinfra.Flow and execinfra.Processor

For pausable portals, each execution needs to resume the processor with new
output receiver. We don't need to restart the processors, and this `Resume()`
step can be called many times after `Run()` is called.

----
### sql: reuse flow for pausable portal
To execute portals in an interleaving manner, we need to persist the flow and
queryID so that we can _continue_ fetching the result when we come back to the same
portal.

We now introduce `pauseInfo` field in `sql.PreparedPortal` that stores this
metadata. It's set during the first-time execution of an engine, and all
following executions will reuse the flow and the queryID. This also implies that
these resources should not be cleaned up with the end of each execution.
Implementation for the clean-up steps is included in the next commit.

Also, in this commit we hang a `*PreparedPortal` to the planner, and how it is
set can be seen in the next commit as well.

Epic: CRDB-17622

Release note: None


Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Michael Erickson <michae2@cockroachlabs.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Co-authored-by: Jane Xing <zhouxing@uchicago.edu>
@craig craig bot closed this as completed in 6018086 Mar 22, 2023
michae2 added a commit to michae2/cockroach that referenced this issue Mar 27, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 27, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 27, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
@blathers-crl
Copy link

blathers-crl bot commented Mar 28, 2023

Hi @michae2, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@michae2 michae2 added the branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 label Mar 28, 2023
@michae2
Copy link
Collaborator

michae2 commented Mar 28, 2023

Marking this as a GA blocker because a customer is waiting for it, and also I'd like to merge it to release-23.1 before #99663.

michae2 added a commit to michae2/cockroach that referenced this issue Mar 28, 2023
Add a new circular doubly-linked list of prepared statements to
`prepStmtNamespace` which tracks the least-recently-used prepared
statement. When new setting `prepared_statements_cache_size` is set,
use this LRU list to automatically deallocate prepared statements.

Fixes: cockroachdb#97866

Epic: None

Release note (sql change): Add a new `prepared_statements_cache_size`
setting which, when set to a non-zero number of bytes, causes the least-
recently-used prepared statements to be automatically deallocated when
prepared statement memory usage goes above the cache size. This setting
can be used to avoid prepared statement leaks from long-lived
connections which never `DEALLOCATE` prepared statements.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants