sql: make InternalExecutor have the same descs.Collection as the parent #69495
Comments
Doing something pragmatic here where we tightly couple the InternalExecutor to the Today the There's also the fact that we take this I think what I'd prefer is that we remove the I think there's good reason to bring sanity to this layer but doing a good job here isn't trivial. |
At its core, I think the problem here is that the internal executor abstraction is sitting at the wrong layer. It's this free-floating thing that is sort of associated with a session and sort of not. We should bind the internal executor much more closely to the session and connection. It should not take a transaction but rather should be initialized with one. We'll still need to be careful with concurrency and transaction usage, but that's not new. This feels like a relatively important thing to do sooner rather than later. |
This is only with regards to the InternalExecutor used for builtins (aka on EvalContext) right? I think my PR here might fix it. |
How does this fix it? The internal executor hanging off of the |
Nvm just realized we still init a new conn_executor at https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/internal.go#L182 which is where the Collections is created. We could make it so InternalExecutor overrides could take Collections as a short term solution. Although I don't think having to always override to use the IE is a good idea which is what's going on in #71246 to use the IE for any builtins. |
I agree. I think we should increase the coupling between the internal executor and the conn executor in the common case. See #71246 (comment). |
I think this is affecting activerecord too cockroachdb/activerecord-cockroachdb-adapter#234 |
I gave this a go (this was intended to be a quick and dirty to see if passing the same descCollection through is enough): #73100 But end up with this when running this query - any ideas?:
|
Curious if @RichardJCai has an idea? He did something similar in #71246 |
I think I actually had a working method to pass the descCollections through in an earlier version of #71246, I don't recall running into any of those errors. Once #71246 is merged I can add back that code (trying to limit the scope of that refactor). |
@RichardJCai has the time come? |
Pretty busy as of recent, I have another refactor in the works that I think I can easily tag this desc.Collections work (injecting desc.Collections) as part of. #73293 Will probably be a flex friday thing though. |
Currently, the internal executor always creates its own descriptor collection, txn state, job collection, etc. for its conn executor, even though it's run underneath a "parent" query. This re-creation can unnecessarily reduce query efficiency in some use cases, such as when an internal executor is used under a planner context. In this case, the internal executor is expected to inherit this info from the planner, rather than creating its own. To make this rule more explicit, this commit adds a series of query functions under `sql.planner`. Each of these functions wraps both the init of an internal executor and the query execution. In this way, the internal executor always stores the info inherited from the parent planner, and will pass it to its child conn executor. fixes cockroachdb#69495 Release note: None
Currently, the internal executor always creates its own descriptor collection, txn state, job collection, etc. for its conn executor, even though it's run underneath an outer txn. This re-creation can unnecessarily reduce query efficiency in some use cases, such as when an internal executor is used under a planner context. In this case, the internal executor is expected to inherit this info from the planner, rather than creating its own. To make this rule more explicit, this commit adds a series of query functions under `sql.planner`. Each of these functions wraps both the init of an internal executor and the query execution. In this way, the internal executor always stores the info inherited from the parent planner, and will pass it to its child conn executor. fixes cockroachdb#69495 Release note: None
82477: sql: introduce new internal executor interfaces r=rafiss,ajwerner a=ZhouXing19

This PR aims to provide a set of safer interfaces for the internal executor, making it harder to misuse.

Currently, each conn executor underneath the internal executor (we call it the “child executor”) has its own set of information, such as the descriptor collection, job collection, schema change jobs, etc., even when it's run with a non-nil outer `kv.Txn`, or when there are multiple SQL executions under the same `kv.Txn`. This is not intuitive, since it violates a rather deep principle that a `descs.Collection` and a SQL txn have a 1:1 relationship. The code doesn't enforce that, but it ought to. The more places that make it possible to decouple this, the more anxious we get. Ideally, an internal executor with a non-nil txn is either planner or `collectionFactory` oriented, so that the txn is always tightly coupled with the descriptor collection. We thus propose a set of new interfaces to ensure this coupling.

Currently, the usage of an internal executor query function (e.g. `InternalExecutor.ExecEx()`) falls into the following 3 categories:

1. The query is run under a planner context and with a non-nil `kv.Txn` from this planner.
2. The query is run without a `kv.Txn` (e.g. `InternalExecutor.ExecEx(..., nil /* txn */, stmt...)`).
3. The query is run with a non-nil `kv.Txn` but not under the planner context.

For usage 1, the descriptor collections, txn state, job collections, and session data from the parent planner are expected to be passed to the internal executor's child conn executor. For usages 2 and 3, if multiple SQL statements are run under the same txn, these executions should share the `descs.Collection`, txn state machine, job collections and session data for their conn executors.

To suit these 3 use cases, we propose 3 interfaces for each of the query functions (in the following we use `InternalExecutor.ExecEx` as the example):

- For case 1, refactor to use `func (p *planner) ExecExUpdated()`, where the internal executor is always initialized with the `descs.Collection`, `TxnState`, etc. from the `sql.planner`.
- For case 2, refactor to use `ieFactory.WithoutTxn()`, where the query is always run with a nil `kv.Txn`.
- For case 3, refactor to use `CollectionFactory.TxnWithExecutor()`. In this function, the internal executor is generated and passed to the callback function to run the query.

We also tried refactoring some of the existing use cases to give an example of the new interface.

(Note that the ultimate goal of this improvement is to deprecate all the "free-hanging" `InternalExecutor` objects (such as `sql.ExecutorConfig.InternalExecutor`) and replace them with an `InternalExecutorFactory` field. `InternalExecutorFactory` is to initialize a REAL internal executor, but it cannot be used directly to run SQL statement queries. Instead, we wrap the initialization of an internal executor inside each query function, i.e. init it only when you really need to run a query. In other words, the creation of an internal executor moves closer to where the query is run.)

fixes #69495
fixes #78998

Release Note: None

Co-authored-by: Jane Xing <zhouxing@uchicago.edu>
Is your feature request related to a problem? Please describe.
The internal executor uses a different descriptor `Collection` than the parent `conn_executor`. We've seen this cause performance problems since some builtins are implemented using the internal executor. If the builtin is called once per output row in a query, then the table descriptors need to be re-fetched for each invocation of the internal executor. @ajwerner suggests that not using the same Collection might be a correctness problem. #58356 (comment)
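The per-row cost described above can be illustrated with a toy model. This is a minimal sketch, assuming a simplified cache: `descriptorStore`, `collection`, `runBuiltin` and `countFetches` are hypothetical names invented for illustration and do not reflect the real `descs.Collection` API. It counts how often a descriptor has to be fetched when each builtin invocation builds a fresh collection versus when it reuses the parent's.

```go
package main

import "fmt"

// descriptorStore stands in for the backing descriptor storage and counts
// how many times a descriptor had to be fetched.
type descriptorStore struct{ fetches int }

func (s *descriptorStore) fetch(name string) string {
	s.fetches++
	return "desc:" + name
}

// collection is a toy per-transaction descriptor cache (hypothetical, not
// the real descs.Collection).
type collection struct {
	store *descriptorStore
	cache map[string]string
}

func newCollection(s *descriptorStore) *collection {
	return &collection{store: s, cache: map[string]string{}}
}

// getTable returns the cached descriptor, fetching it only on a miss.
func (c *collection) getTable(name string) string {
	if d, ok := c.cache[name]; ok {
		return d
	}
	d := c.store.fetch(name)
	c.cache[name] = d
	return d
}

// runBuiltin models a builtin that resolves one table through an internal
// executor. With a nil parent, the executor builds its own collection.
func runBuiltin(s *descriptorStore, parent *collection) {
	c := parent
	if c == nil {
		c = newCollection(s)
	}
	c.getTable("t")
}

// countFetches invokes the builtin once per row and reports how many
// fetches reached the store.
func countFetches(rows int, share bool) int {
	s := &descriptorStore{}
	var parent *collection
	if share {
		parent = newCollection(s)
	}
	for i := 0; i < rows; i++ {
		runBuiltin(s, parent)
	}
	return s.fetches
}

func main() {
	fmt.Println("fresh collection per call:", countFetches(1000, false))
	fmt.Println("shared parent collection: ", countFetches(1000, true))
}
```

In this model the fresh-collection variant fetches once per row, while the shared variant fetches once in total.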
Describe the solution you'd like
It would be nice if the internal executor could use the same Collection as the conn_executor it's used from.
Describe alternatives you've considered
Implement builtins without using the internal executor.
Additional context
This caused severe performance problems in #57924 and #65551. We still have a backlog item for addressing other similar issues (#66173)
This was discussed in Slack: https://cockroachlabs.slack.com/archives/C0168LW5THS/p1611947857055600
Jira issue: CRDB-9622
Epic CRDB-14492