release-20.2: sql/pgwire: fix statement buffer memory leak when using suspended portals #67370

Merged 1 commit on Jul 9, 2021

Commits on Jul 9, 2021

  1. sql/pgwire: fix statement buffer memory leak when using suspended portals
    
    The connection statement buffer grows indefinitely when the client uses the
    execute portal with limit feature of the Postgres protocol, eventually causing
    the node to crash with an out-of-memory error. Any long-running query that
    uses the limit feature triggers this leak, for example the
    `EXPERIMENTAL CHANGEFEED FOR` statement. The JDBC Postgres driver uses the
    execute portal with limit feature to fetch a limited number of rows at a time.
    
    The leak is caused by commands accumulating in the buffer and never getting
    cleared out. The client sends 2 commands every time it wants to receive more
    rows:
    
    - `Execute {"Portal": "C_1", "MaxRows": 1}`
    - `Sync`
    
    The server processes the commands but leaves them in the buffer, so every
    iteration leaks 2 more commands.
    
    A similar memory leak was fixed by cockroachdb#48859; however, the execute with
    limit feature is implemented by a side state machine in limitedCommandResult.
    The cleanup routine added by cockroachdb#48859 is never executed for suspended
    portals, as they never return to the main conn_executor loop.
    
    After this change, the statement buffer gets trimmed to reclaim memory after
    each client command is processed in the limitedCommandResult side state
    machine. The StmtBuf.Ltrim function was made public to enable this. While
    this is not ideal, it does scope the fix to the limitedCommandResult side
    state machine, and it could be addressed when the limitedCommandResult
    functionality is refactored into the conn_executor.
    
    Added a unit test that reproduces the leak. The test uses the PGWire client
    because neither the pg nor the pgx client uses execute with limit, so they
    can't be used to demonstrate the leak. Also tested the fix in a cluster by
    following the steps outlined in cockroachdb#66849.
    
    Resolves: cockroachdb#66849
    
    See also: cockroachdb#48859
    
    Release note (bug fix): fix statement buffer memory leak when using
    suspended portals
    joesankey committed Jul 9, 2021
    a0fb94e