
sql: fix the lifetime of portals #33423

Merged
merged 1 commit into from
Jan 13, 2019
Merged

Conversation

andreimatei
Contributor

@andreimatei andreimatei commented Jan 2, 2019

Before this patch, the lifetime of portals was completely broken (wrt
Postgres). A portal used to live until the prepared statement from which
it was created was destroyed.
What Postgres wants is the following:

  • portals created in transactions live until the transaction is done
  • portals created outside of transactions live until the following Sync
    command

Notice that the lifetimes of portals have nothing to do with the lifetime of
prepared statements.

This patch implements something kinda analogous (a toy sketch follows this
list):

  • transactional portals live until the transaction is done, like in
    Postgres
  • non-transactional portals live until the next transaction is done, so
    they'll be destroyed as soon as a non-transactional statement is
    executed, or when they themselves are executed (which is the common
    case), or when some other transaction runs. This is close enough.
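A toy sketch in Go of the rule above (illustrative only, not the actual
connExecutor code; all names are made up): every portal still open when a
transaction finishes is destroyed, which covers both the transactional case
and the "next transaction" case for non-transactional portals.

package main

import "fmt"

// session is a stand-in for the connection state; portals maps portal names
// to a placeholder for their state.
type session struct {
	portals map[string]struct{}
}

func (s *session) bindPortal(name string) {
	s.portals[name] = struct{}{}
	fmt.Printf("created portal %q\n", name)
}

// txnFinished models the end of any transaction, explicit or implicit:
// every portal still around at this point is destroyed.
func (s *session) txnFinished() {
	for name := range s.portals {
		fmt.Printf("destroying portal %q\n", name)
		delete(s.portals, name)
	}
}

func main() {
	s := &session{portals: map[string]struct{}{}}

	// Portal created inside an explicit transaction: destroyed at COMMIT.
	s.bindPortal("p1")
	s.txnFinished() // COMMIT

	// Portal created outside a transaction: destroyed when the next
	// (usually implicit) transaction -- commonly its own execution -- ends.
	s.bindPortal("p2")
	s.txnFinished() // the implicit transaction around the Execute
}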

This patch also makes a few other changes:

  • I've reworked how portals and prepared statements are kept alive by
    references from two sources: the prepStmtsNamespace and
    prepStmtsNamespaceAtTxnRewindPos. Entries need to live while they're
    referenced from either of those. The code used to implement a pretty
    inscrutable scheme for ensuring this; I've switched it to a simpler
    reference-counting scheme (sketched after this list).
  • I've removed the references from prepared statements to portals and
    reworked the way in which prepStmtsNamespace is cleared, which hopefully
    eliminates a bunch of the map allocations between transactions that
    Nathan was complaining about.
  • I've written a test which is the first-ever unit test for the
    connExecutor. It took an ungodly amount of boilerplate to create a
    connExecutor that's not running in the context of a full node, but now
    it can be nicely reused. Having this testing facility was long overdue.
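A minimal sketch of the reference-counting scheme from the first bullet above
(illustrative Go with made-up types, not the actual crdb code): an entry stays
alive while either namespace still references it, and is closed when the last
reference is dropped.

package main

import "fmt"

// prepStmtEntry is a stand-in for a prepared statement (or portal) that must
// stay alive while any namespace references it.
type prepStmtEntry struct {
	name     string
	refCount int
}

func (e *prepStmtEntry) incRef() { e.refCount++ }

func (e *prepStmtEntry) decRef() {
	e.refCount--
	if e.refCount == 0 {
		fmt.Printf("closing %q\n", e.name) // last reference dropped
	}
}

// namespace stands in for prepStmtsNamespace / prepStmtsNamespaceAtTxnRewindPos:
// each one that holds an entry accounts for one reference.
type namespace map[string]*prepStmtEntry

func (ns namespace) add(e *prepStmtEntry) {
	e.incRef()
	ns[e.name] = e
}

func (ns namespace) clear() {
	for name, e := range ns {
		e.decRef()
		delete(ns, name)
	}
}

func main() {
	cur := namespace{}
	atRewindPos := namespace{}

	stmt := &prepStmtEntry{name: "s1"}
	cur.add(stmt)
	atRewindPos.add(stmt)

	cur.clear()         // still referenced from atRewindPos: not closed yet
	atRewindPos.clear() // last reference dropped: closed here
}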

Release note: None

@andreimatei andreimatei requested review from jordanlewis and a team January 2, 2019 02:08
@cockroach-teamcity
Member

This change is Reviewable

@andreimatei
Contributor Author

cc @knz so I don't sneak the 2nd commit past him

Contributor

@knz knz left a comment


About this 2nd commit: I'm OK with the change -- but then why don't you go and add the correct memory accounting instead? Anything (like portals) that can be allocated an arbitrary number of times deserves memory accounting.

Reviewed 4 of 4 files at r1, 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/conn_executor.go, line 961 at r1 (raw file):

	prepStmts map[string]*PreparedStatement
	// portals contains the portals currently available on the session.
	// TODO(andrei): The lifetime of portals is different from the Postgres. In

...from that of postgres...


pkg/sql/conn_executor.go, line 964 at r1 (raw file):

	// Postgres portals are tied to a transaction, and also one can't be reused
	// after exhausting its results. Everything we do about portals is nonsense
	// because we didn't understand them when we implemented them.

Isn't your patch fixing this TODO?

Otherwise, add a suggestion in the comment of what needs to be done to improve the situation

Contributor Author

@andreimatei andreimatei left a comment


but then why don't you go and add the correct memory accounting instead? Anything (like portals) that can be allocated an arbitrary number of times deserves memory accounting.

Well, I'm not attempting to improve things because I don't know how to measure the memory of the arguments. They're TypedExprs...
Also, frankly, I don't think it's worth it. I've come to think very differently about portals now that I've studied them a bit: they're not supposed to be long-lived things (like prepared stmts are). You're not supposed to have many of them hanging around on a connection because they can't be used to run queries repeatedly - a portal runs a query once and, once all the results are exhausted, you can't use it any more. In practice, particularly given that we don't have a procedural language that allows one to use multiple portals and switch between them, nobody is going to create more than one of these portals in crdb (namely, the unnamed one).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/conn_executor.go, line 961 at r1 (raw file):

Previously, knz (kena) wrote…

...from that of postgres...

Done.


pkg/sql/conn_executor.go, line 964 at r1 (raw file):

Previously, knz (kena) wrote…

Isn't your patch fixing this TODO?

Otherwise, add a suggestion in the comment of what needs to be done to improve the situation

My patch is only fixing one aspect (the portals can now survive a prep stmt being deleted) which is not the most important in the grand scheme.
The TODO lists the other problems - how portals should be tied to transactions and how we shouldn't allow them to be reused. I'm trying to do something about the latter at least - I have something cooking for these portals on the side.

Member

@jordanlewis jordanlewis left a comment


In practice, particularly given that we don't have a procedural language that allows one to use multiple portals and switch between them, nobody is going to create more than one of these portals in crdb (namely, the unnamed one).

But this is untrue, as we've discovered in our investigations - in practice, applications only created named portals. They do dispose of them promptly, when the transaction ends, but it's worth noting nonetheless.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Contributor

@knz knz left a comment


nobody is going to create more than one of these portals in crdb

  1. If that is indeed the assumption, then I'd recommend erroring out when the assumption does not hold, instead of letting random objects linger in the session.

     That is, for example, an error if there are n>10 portals remaining when a txn has been committed; or if there is n>0 portal remaining when a txn starts. (A rough sketch of such a check follows this list.)

  2. I reiterate that we need memory monitoring for anything that grows arbitrarily inside one session. If you would prefer to not do any monitoring, then add a (low) limit on the number of portals; this way the max allocations are O(1) and thus need not be more monitored than the number of sessions overall.
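A rough sketch of the kind of check proposed in point 1 (illustrative Go with a made-up session type and an arbitrary cap, not crdb code): refuse to create a portal once a low limit is reached, so the number of live portals per session stays O(1).

package main

import "fmt"

const maxOpenPortals = 10 // deliberately low, arbitrary cap

type session struct {
	portals map[string]struct{}
}

// addPortal errors out instead of letting portals accumulate without bound.
func (s *session) addPortal(name string) error {
	if len(s.portals) >= maxOpenPortals {
		return fmt.Errorf("too many open portals (%d); cannot create %q",
			len(s.portals), name)
	}
	s.portals[name] = struct{}{}
	return nil
}

func main() {
	s := &session{portals: map[string]struct{}{}}
	for i := 0; ; i++ {
		if err := s.addPortal(fmt.Sprintf("p%d", i)); err != nil {
			fmt.Println(err) // fires once the 11th portal is attempted
			break
		}
	}
}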

Reviewed 3 of 3 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/conn_executor.go, line 964 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

My patch is only fixing one aspect (the portals can now survive a prep stmt being deleted) which is not the most important in the grand scheme.
The TODO lists the other problems - how portals should be tied to transactions and how we shouldn't allow them to be reused. I'm trying to do something about the latter at least - I have something cooking for these portals on the side.

I still think that the comment could be more specific about what is wrong (instead of hand waving "everything we do is nonsense")

Contributor

@knz knz left a comment


Oh and you can also consider:

  3. add telemetry for suspicious numbers of portals (eg n>1? or n>10), so we can investigate what's done in the wild

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Contributor Author

@andreimatei andreimatei left a comment


But this is untrue, as we've discovered in our investigations - in practice, applications only created named portals. They do dispose of them promptly, when the transaction ends, but it's worth noting nonetheless.

Wait what? When have we discovered that? pgx, for example, only uses the unnamed one.

If that is indeed the assumption, then I'd recommend erroring out when the assumption does not hold, instead of letting random objects linger in the session.

Well, it's up to us to destroy them, and we don't. So an assertion would only... fire :)
The situation is also complicated by portals outside of a transaction, which in Postgres live until a Sync message (so one can do parse/bind/execute(limit 1)/execute(limit 1).../sync), but in our case the transaction is finished after the first execute, so... I don't know what to do exactly. One option is to only allow portals to be used inside transactions (so a portal used outside of a transaction could be destroyed as soon as it was executed the first time). Or... more work.
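To illustrate the execute(limit 1) point above, a toy Go model (made-up types, not the pgwire code): in Postgres, a portal created outside an explicit transaction is merely suspended by an Execute with a row limit and can be resumed until Sync, whereas if the implicit transaction ends after the first Execute, the second one has nothing left to resume.

package main

import "fmt"

type portal struct {
	remaining int // rows the portal's query still has to return
}

type session struct {
	portals       map[string]*portal
	inExplicitTxn bool
}

// execute returns up to maxRows rows from the named portal, if it still exists.
func (s *session) execute(name string, maxRows int) error {
	p, ok := s.portals[name]
	if !ok {
		return fmt.Errorf("portal %q does not exist", name)
	}
	n := maxRows
	if n > p.remaining {
		n = p.remaining
	}
	p.remaining -= n
	fmt.Printf("portal %q returned %d rows, %d left\n", name, n, p.remaining)
	// Outside an explicit transaction, the implicit transaction finishes right
	// after this statement and takes the portal with it; Postgres would instead
	// keep the suspended portal around until Sync.
	if !s.inExplicitTxn {
		delete(s.portals, name)
	}
	return nil
}

func main() {
	s := &session{portals: map[string]*portal{"": {remaining: 10}}}
	_ = s.execute("", 1) // first execute(limit 1) succeeds
	if err := s.execute("", 1); err != nil {
		fmt.Println(err) // the second one cannot resume the portal
	}
}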

add telemetry for suspicious numbers of portals (eg n>1? or n>10), so we can investigate what's done in the wild

I'll let someone else do that :)

I reiterate that we need memory monitoring for anything that grows arbitrarily inside one session. If you would prefer to not do any monitoring, then add a (low) limit on the number of portals; this way the max allocations are O(1) and thus need not be more monitored than the number of sessions overall.

Well, I intend to work on some of these things, but more on the portal cleanup part than on the limits and accounting part. With the code as it stands, I don't think putting in a limit would serve much purpose. If you want to do it, though...
The way I see it, the 2nd commit removes a bit of accounting that was inaccurate anyway. I think it's better not to pretend we have accounting than to have this. But I can also leave it if you prefer; I'm not looking to do more than that. The reason I deleted it in the first place is that I thought I could get rid of the Portal.Close() method, but it turns out that I can't as long as the portals hold references to that memo.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/conn_executor.go, line 964 at r1 (raw file):

Previously, knz (kena) wrote…

I still think that the comment could be more specific about what is wrong (instead of hand waving "everything we do is nonsense")

I've removed the last sentence although I think it was good - this being a TODO it was more like "go talk to Andrei" than handwaving.
I'm not really prepared to say more than what's here, nor would this be a good place for it I think.

Member

@jordanlewis jordanlewis left a comment


Wait what? When have we discovered that? pgx, for example, only uses the unnamed one.

The Java drivers all do this. When you try to use a portal with a limit, the driver automatically promotes the statement to one within a transaction and names the portal with an autoincremented integer.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@knz
Contributor

knz commented Jan 3, 2019

I reiterate that we need memory monitoring for anything that grows arbitrarily inside one session. If you would prefer to not do any monitoring, then add a (low) limit on the number of portals; this way the max allocations are O(1) and thus need not be more monitored than the number of sessions overall.

I do not care about the previous code that much - if you want to remove it that's your choice. I do stand by my point that some code must exist for monitoring (perhaps new code).

In the end you know my position on this and I know yours. I have sent a discussion item offline, we can discuss around that before we settle on this PR.

Contributor Author

@andreimatei andreimatei left a comment


I've removed the commit that was removing the memory accounting; it's causing too much of a ruckus.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Contributor Author

@andreimatei andreimatei left a comment


Note to self: together with this patch I should also add cleanup for portals when transactions end. Otherwise, this might be a regression: re-defining the unnamed prepared statement used to destroy its portals and now it doesn't, so named portals hanging off the unnamed prepared statement would now leak whereas before they didn't.
So hold off on the review.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@andreimatei andreimatei requested a review from a team January 6, 2019 00:01
@andreimatei andreimatei changed the title from "sql: don't close portals when the prepared stmt is closed" to "sql: fix the lifetime of portals" Jan 6, 2019
@andreimatei
Contributor Author

I've expanded and reworked this patch. PTAL

The first commit is #33529

@andreimatei andreimatei force-pushed the sql.portals branch 2 times, most recently from 3e15d67 to 399ced9 on January 6, 2019 20:55
Contributor

@knz knz left a comment


:lgtm:

Reviewed 9 of 9 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Before this patch, the lifetime of portals was completely broken (wrt
Postgres). A portal used to live until the prepared statement from which
it was created was destroyed.
What Postgres wants is the following:
- for portals created in transactions, they live until the transaction is
done
- for portals created outside of transactions, they live until the
following Sync command
Notice that the lifetimes of portals have nothing to do with the
lifetime of prepared statements.

This patch implements something kinda analogous:
- transactional portals live until the transaction is done, like in
Postgres
- non-transactional portals live until the next transaction is done, so
they'll be destroyed as soon as a non-transactional statement is
executed, or when they themselves are executed (which is the common
case), or until some other transaction runs. This is close enough.

This patch also comes with other points:
- I've reworked how portals and prepared statements are kept alive by
references from two sources: the prepStmtsNamespace and
prepStmtsNamespaceAtTxnRewindPos. Entries need to live while they're
referenced from either of those. The code used to implement a pretty
inscrutable scheme for ensuring this; now I've switched it to a
reference counting scheme which is simpler.
- I've removed the references from prepared statements to portals and
I've reworked the way in which prepStmtsNamespace is cleared, and so
hopefully I've eliminated a bunch of map allocations happening in
between transactions that Nathan was complaining about.
- I've written a test which is the first ever unit test for the
connExecutor. It took an ungodly amount of boilerplate to create a
connExecutor that's not running in the context of a full node, but now
it can be nicely reused. Having this testing facility was long overdue.
- I've fixed a bug in the implementation of
connExPrepStmtsAccessor.DeleteAll() that "leaked" the memory accounting
for prepared statements. This could lead to crashes, as shown below
(a sketch of the fix follows the release note).

root@127.0.0.1:53405/defaultdb> begin; select 1;
root@127.0.0.1:53405/defaultdb  OPEN> prepare x as select 1;
root@127.0.0.1:53405/defaultdb  OPEN> deallocate all;
root@127.0.0.1:53405/defaultdb  OPEN> commit;
root@127.0.0.1:53405/defaultdb> ^D
panic: session: unexpected 10240 leftover bytes

Release note (bug fix): Fix a memory leak around DEALLOCATE and DISCARD
statements that could result in crashes with the "unexpected <amount>
leftover bytes" message.
Contributor Author

@andreimatei andreimatei left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

craig bot pushed a commit that referenced this pull request Jan 13, 2019
33423: sql: fix the lifetime of portals r=andreimatei a=andreimatei

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@craig
Contributor

craig bot commented Jan 13, 2019

Build succeeded

@craig craig bot merged commit f612484 into cockroachdb:master Jan 13, 2019
@andreimatei andreimatei deleted the sql.portals branch January 15, 2019 19:11