Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: SHOW QUERIES lazily interpolates placeholders #86968

Merged
merged 3 commits into from
Oct 28, 2022

Conversation

jordanlewis
Copy link
Member

@jordanlewis jordanlewis commented Aug 26, 2022

SHOW QUERIES (and crdb_internal.node_queries, cluster_queries)
interpolates placeholder values into the statement so that it is
possible to see the placeholder values of a prepared statement - but it
used to do this unconditionally during statement execution.

This is an expensive process that spends a lot of CPU for little reason,
since the interpolation was happening in the hot path of every query.

Now, we include the placeholder values as a separate array in the
internal representation of active queries, and interpolate the values
only when they're being pulled out to examine, to avoid the
unconditional runtime interpolation costs.

As a side effect of this change, the original comments in a query are
now included in SHOW QUERIES and the two active queries tables.

Release note (sql change): the query field in the
crdb_internal.node_queries, crdb_internal.cluster_queries, and SHOW
QUERIES commands now includes the original comments in the queries.

Informs CRDB-17299

@jordanlewis jordanlewis requested a review from a team August 26, 2022 22:09
@jordanlewis jordanlewis requested a review from a team as a code owner August 26, 2022 22:09
@jordanlewis jordanlewis requested a review from a team August 26, 2022 22:09
@jordanlewis jordanlewis requested a review from a team as a code owner August 26, 2022 22:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice improvements! both changes need tests

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)


pkg/sql/conn_executor.go line 3163 at r1 (raw file):

		placeholders := make([]string, nPlaceholders)
		for i := range placeholders {
			placeholders[i] = query.placeholders.Values[i].String()

nit: maybe use tree.AsStringWithFlags(..., tree.FmtPgwireText) here


pkg/sql/crdb_internal.go line 1894 at r1 (raw file):

				return err
			}
			arr := tree.NewDArray(types.String)

nit: name it placeholders instead of arr


pkg/sql/crdb_internal.go line 1897 at r1 (raw file):

			arr.Array = make(tree.Datums, len(query.Placeholders))
			for i, v := range query.Placeholders {
				arr.Array[i] = tree.NewDString(v)

does this need to use (*Darray).Append?

@jordanlewis jordanlewis force-pushed the no-interpolate branch 2 times, most recently from af99535 to ce18c4b Compare August 29, 2022 03:10
@rafiss rafiss self-requested a review August 29, 2022 14:52
@dhartunian dhartunian removed the request for review from a team August 31, 2022 14:30
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first commit needs new tests, and fixes to existing ones

@jordanlewis
Copy link
Member Author

Yes, sorry, it's not ready for review yet.

@jordanlewis jordanlewis changed the title sql: SHOW QUERIES doesn't interpolate placeholders sql: SHOW QUERIES lazily interpolates placeholders Sep 23, 2022
@jordanlewis
Copy link
Member Author

I reworked this to lazily interpolate the output of crdb_internal.cluster_statements etc as well as SHOW QUERIES, to avoid changing the API. If we want to change the API we can do it later. Now, we still gain the performance enhancements because we avoid having to unconditionally interpolate during the hot path - instead, we interpolate just when printing the active queries on the other end.

@jordanlewis
Copy link
Member Author

PTAL @rafiss

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lgtm! just had one optional comment about testing.

does the first commit mean that we will get comments in telemetry logging?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/show_test.go line 735 at r11 (raw file):

		},
		{
			"SELECT /* test */ 'hi'::string",

what if there are two comments?

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


-- commits line 28 at r11:
this release note is old

This commit adds a new array to the return value from the parser, which
contains the comments that were in the parsed SQL statement.

Release note: None
SHOW QUERIES (and crdb_internal.node_queries, cluster_queries)
interpolates placeholder values into the statement so that it is
possible to see the placeholder values of a prepared statement - but it
used to do this unconditionally during statement execution.

This is an expensive process that spends a lot of CPU for little reason,
since the interpolation was happening in the hot path of every query.

Now, we include the placeholder values as a separate array in the
internal representation of active queries, and interpolate the values
only when they're being pulled out to examine, to avoid the
unconditional runtime interpolation costs.

As a side effect of this change, the original comments in a query are
now included in SHOW QUERIES and the two active queries tables.

Release note (sql change): the query field in the
crdb_internal.node_queries, crdb_internal.cluster_queries, and SHOW
QUERIES commands now includes the original comments in the queries.
@jordanlewis
Copy link
Member Author

does the first commit mean that we will get comments in telemetry logging?

Yes, I believe so.

Previously, the parser (actually the scanner) removed comments that came
at the end of a statement from the raw sql that was returned along with
the parsed AST. This behavior is now removed.

Release note: None
Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


-- commits line 28 at r11:

Previously, rafiss (Rafi Shamim) wrote…

this release note is old

Done.


pkg/sql/crdb_internal.go line 1897 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

does this need to use (*Darray).Append?

Done.


pkg/sql/show_test.go line 735 at r11 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

what if there are two comments?

Done.

@jordanlewis
Copy link
Member Author

@rafiss PTAL

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! rebase to fix the TestComposeGSSPython failure

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@jordanlewis
Copy link
Member Author

TFTR! I will just bors it I think that should get past the flake right?

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 28, 2022

Build failed:

@jordanlewis
Copy link
Member Author

Unrelated flake

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 28, 2022

Build succeeded:

@craig craig bot merged commit 3064717 into cockroachdb:master Oct 28, 2022
@jordanlewis jordanlewis deleted the no-interpolate branch October 28, 2022 17:27
mgartner added a commit to mgartner/cockroach that referenced this pull request Aug 15, 2024
In cockroachdb#86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes cockroachdb#127713

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Aug 15, 2024
In cockroachdb#86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes cockroachdb#127713

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Aug 15, 2024
In cockroachdb#86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes cockroachdb#127713

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Aug 15, 2024
In cockroachdb#86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes cockroachdb#127713

Release note: None
craig bot pushed a commit that referenced this pull request Aug 15, 2024
128032: Reapply "bootstrap: create an explicit zoneconfig for timeseries data" r=rafiss a=rafiss

This partially reverts commit f9d47ce.

This time, rather than copying over the full zone config for the default range to the timeseries range, we only copy over gc.ttlseconds. This means that other attributes, most notably the replication factor, will be inherited from the default range.

This will make it more clear that the timeseries zone config can be changed independently from all the other zone configs.

fixes #123762

Release note (ops change): New clusters that are initialized for the first time will now have a zone config defined for the `timeseries` range. This zone config only specifies the gc.ttlseconds, so all the other attributes are inherited from the zone config of the `default` range, as they were before.

Clusters that are upgraded to v24.3 from a previous version will also
have this zone configuration applied to the timeseries range, as long as
that range does not already have a zone config.

128987: sql/sem/tree: remove duplicate `tree.NewDOidWithName` function r=mgartner a=mgartner

The `tree.NewDOidWithName` function has been removed. All previous
callers now use `tree.NewDOidWithTypeAndName` which is exactly the same.

Epic: None

Release note: None


129053: sql/parser: only retain scanned SQL comments when necessary r=mgartner a=mgartner

In #86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes #127713

Release note: None


Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
craig bot pushed a commit that referenced this pull request Aug 15, 2024
128032: Reapply "bootstrap: create an explicit zoneconfig for timeseries data" r=rafiss a=rafiss

This partially reverts commit f9d47ce.

This time, rather than copying over the full zone config for the default range to the timeseries range, we only copy over gc.ttlseconds. This means that other attributes, most notably the replication factor, will be inherited from the default range.

This will make it more clear that the timeseries zone config can be changed independently from all the other zone configs.

fixes #123762

Release note (ops change): New clusters that are initialized for the first time will now have a zone config defined for the `timeseries` range. This zone config only specifies the gc.ttlseconds, so all the other attributes are inherited from the zone config of the `default` range, as they were before.

Clusters that are upgraded to v24.3 from a previous version will also
have this zone configuration applied to the timeseries range, as long as
that range does not already have a zone config.

129053: sql/parser: only retain scanned SQL comments when necessary r=mgartner a=mgartner

In #86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes #127713

Release note: None


Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
cthumuluru-crdb pushed a commit to cthumuluru-crdb/cockroach that referenced this pull request Aug 22, 2024
In cockroachdb#86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes cockroachdb#127713

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants