Previously we supported row count limits as long as the limit was higher
than the number of returned rows. Here we add a feature that supports
this part of the spec in a partial way. This should unblock simple JDBC
usage of this feature.
Supported use cases:
- implicit transactions (which auto close the portal after suspension)
- explicit transactions executed to completion
Unsupported use cases (with explicit transactions):
- interleaved execution
- explicitly closing a portal after partial execution
Many options were evaluated during implementation. The one here is based
on work where the pgwire package itself takes over the state machine
processing during AddRow and looks for further ExecPortal messages. This
has a number of problems: there are now two state machines, we can only
support a part of the spec. However it also has a number of benefits
like it is a simple implementation that is easy to understand.
Two other solutions were evaluated.
First, teaching distsql how to pause and resume execution (a
proof-of-concept branch for this was produced). I did not pursue this
route because of my own unfamiliarity with distsql, and I thought that
attempting to reach a high level of confidence that all of the new pause,
resume, and error logic flows were correct would be very difficult
(I may be wrong about this). Also, this approach needed to address how
to handle post-distsql execution cleanup and stats gathering. That is,
after the call into distsql was paused and returned control back to
the sql executor, there's a lot of code that gets run to cleanup stuff
and report stats in various places, including some defers. These would
need to be audited and only execute once per statement, not once per
portal execution.
Second, start distsql execution in a new go routine and send exec portal
requests to it over a channel. This would avoid teaching distsql how
to pause and resume itself, but instead move that complexity into the
channel and go routine handling, another area ripe for race conditions
and deadlocks. This also needed to deal with the post-distsql execution
cleanup and defers handling discussed above.
For now, we decided that this limited implementation is good enough for
what we need today. In order to one day either support the full spec
or pay down some of our technical debt, we can probably do a number of
preliminary refactors that will make invoking much of the distsql path
multpile times easier.
pgx got a version bump to get support for PortalSuspended. The current
pgx version had a few new dependencies too.
See #4035
Release note (sql change): add partial support for row limits during
portal execution in pgwire.