Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Acceptance testing 22.2 - Failed executions insights #87785

Closed
kevin-v-ngo opened this issue Sep 10, 2022 · 2 comments
Closed

Acceptance testing 22.2 - Failed executions insights #87785

kevin-v-ngo opened this issue Sep 10, 2022 · 2 comments
Assignees
Labels
C-escalation-improvement Having this feature would have made an escalation easier O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs

Comments

@kevin-v-ngo
Copy link

kevin-v-ngo commented Sep 10, 2022

Failed execution insights are surfacing already. We are missing critical information such as the error code and message to help users troubleshoot why the statement failed.

image

Jira issue: CRDB-19537

Epic CRDB-20388

@kevin-v-ngo kevin-v-ngo added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Sep 10, 2022
@maryliag maryliag self-assigned this Nov 3, 2022
@maryliag
Copy link
Contributor

maryliag commented Nov 10, 2022

since the error message can contain private information, we need to consider 2 options:

  • simply not display them as is. We can probably create some generic error message for each case and point to the documentation, using the error code as the main point of the message
  • save normally the error message, but when retrieving hide the message if the user has VIEWACTIVITYREDACTED role

@maryliag maryliag assigned gtr and unassigned maryliag Nov 15, 2022
gtr added a commit to gtr/cockroach that referenced this issue Jan 30, 2023
crdb_internal.execution_insights

Part of: cockroachdb#87785, cockroachdb#94381.

Previously, the insights subsystem did not keep track of error code and
error messages for failed executions and only contained a
`FailedExecution` value for the `problem` field. This commit adds
the `last_error_code` and `last_error_msg` columns to the
`crdb_internal.cluster_execution_insights` virtual table. The next PR
will focus on displaying these values in the UI.

Release note (sql change): Added `last_error_code` and `last_error_msg`
columns to the `crdb_internal.cluster_execution_insights` virtual table
which contain the error code and error message for a failed statement
execution, respectively.
gtr added a commit to gtr/cockroach that referenced this issue Jan 31, 2023
crdb_internal.execution_insights

Part of: cockroachdb#87785, cockroachdb#94381.

Previously, the insights subsystem did not keep track of error code and
error messages for failed executions and only contained a
`FailedExecution` value for the `problem` field. This commit adds
the `last_error_code` and `last_error_msg` columns to the
`crdb_internal.cluster_execution_insights` virtual table. The next PR
will focus on displaying these values in the UI.

Release note (sql change): Added `last_error_code` and `last_error_msg`
columns to the `crdb_internal.cluster_execution_insights` virtual table
which contain the error code and error message for a failed statement
execution, respectively.
gtr added a commit to gtr/cockroach that referenced this issue Feb 9, 2023
Part of: cockroachdb#87785, cockroachdb#94381.

Previously, the insights subsystem did not keep track of `error code` for
failed executions and only contailed a `FailedExecution` value for the
problem field. This commit adds the `error_code` column to the
`crdb_internal.{cluster/node}_execution_insights` virtual tables.

Release note (sql change): Adds `error_code` column to the
`crdb_internal.{cluster/node}_execution_insights` virtual tables, which
contains the error code for a failed execution.
gtr added a commit to gtr/cockroach that referenced this issue Feb 10, 2023
Part of: cockroachdb#87785.

Previously, the insights subsystem did not keep track of `error code` for
failed executions and only contained a `FailedExecution` value for the
problem field. This commit adds the `error_code` column to the
`crdb_internal.{cluster/node}_execution_insights` virtual tables. This
commit also bubbles up that error code at the transaction level for
writing into the `crdb_internal.{cluster/node}_txn_execution_insights`
virtual table.

Release note (sql change): Adds `error_code` column to the
`crdb_internal.{cluster/node}_execution_insights` virtual tables, which
contains the error code for a failed execution. Adds `last_error_code`
column to the `crdb_internal.{cluster/node}_txn_execution_insights`
virtual tables, which is the error code of the last failed statement in
that transaction.
gtr added a commit to gtr/cockroach that referenced this issue Feb 13, 2023
tables

Part of: cockroachdb#87785.

Previously, the error codes for failed executions were not written to
the execution insights tables. This commit adds an `error_code` column
to the `crdb_internal.[cluster|node]_execution_insights` virtual tables
to keep track of the error code for a failed stmt execution. This commit
also adds a also adds a `last_error_code` column to to the
`crdb_internal.[cluster|node_txn_execution_insights` virtual tables to
keep track of the error code for the last failed statement for that
transaction.

Release note (sql change): Adds `error_code` column to the
`crdb_internal.[cluster|node]_execution_insights` virtual tables, which
contains the error code for a failed execution. Adds `last_error_code`
column to the `crdb_internal.[cluster|node]_txn_execution_insights`
virtual tables, which is the error code of the last failed statement in
that transaction.
gtr added a commit to gtr/cockroach that referenced this issue Feb 13, 2023
tables

Part of: cockroachdb#87785.

Previously, the error codes for failed executions were not written to
the execution insights tables. This commit adds an `error_code` column
to the `crdb_internal.[cluster|node]_execution_insights` virtual tables
to keep track of the error code for a failed stmt execution. This commit
also adds a also adds a `last_error_code` column to to the
`crdb_internal.[cluster|node_txn_execution_insights` virtual tables to
keep track of the error code for the last failed statement for that
transaction.

Release note (sql change): Adds `error_code` column to the
`crdb_internal.[cluster|node]_execution_insights` virtual tables, which
contains the error code for a failed execution. Adds `last_error_code`
column to the `crdb_internal.[cluster|node]_txn_execution_insights`
virtual tables, which is the error code of the last failed statement in
that transaction.
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 20, 2023
Part of: cockroachdb#87785.

Previously, the insights subsystem did not keep track of the error
messages for failed executions, only the error codes.

This commit  updates the `[cluster|node]_execution_insights` and
`[cluster|node]_txn_execution_insights` virtual tables to include a
`last_error` column which contains the most recent error message.

Release note (sql change): adds `last_error` column to the
`[cluster|node]_execution_insights` and `[cluster|node]_txn_execution_insights`
tables which keeps track of the error message for failed executions.

Co-authored-by: gtr <gerardo@cockroachlabs.com>
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 20, 2023
Part of cockroachdb#87785

This commit adds the error message for failed executions to the
statement and transaction insights pages. Since this value can contain
sensitive information, it must conform to the VIEWACTIVITY and
VIEWACTIVITYREDACTED system privielges.

Release note (ui change): adds "Error message" row to the statement and
transaction insights details pages. If the user has VIEWACTIVITY, they
are able to view the full error message. If they have
VIEWACTIVTYREDACTED, they are given a redacted error message. If they
have both, VIEWACTIVITYTREDACTED  takes precedence.
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 20, 2023
Part of: cockroachdb#87785.

Previously, the insights subsystem did not keep track of the error
messages for failed executions, only the error codes.

This commit  updates the `[cluster|node]_execution_insights` and
`[cluster|node]_txn_execution_insights` virtual tables to include a
`last_error` column which contains the most recent error message.

Release note (sql change): adds `last_error` column to the
`[cluster|node]_execution_insights` and `[cluster|node]_txn_execution_insights`
tables which keeps track of the error message for failed executions.

Co-authored-by: gtr <gerardo@cockroachlabs.com>
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 21, 2023
Part of: cockroachdb#87785.

Previously, the insights subsystem did not keep track of the error
messages for failed executions, only the error codes.

This commit  updates the `[cluster|node]_execution_insights` and
`[cluster|node]_txn_execution_insights` virtual tables to include a
`last_error` column which contains the most recent error message.

Release note (sql change): adds `last_error` column to the
`[cluster|node]_execution_insights` and `[cluster|node]_txn_execution_insights`
tables which keeps track of the error message for failed executions.

Co-authored-by: gtr <gerardo@cockroachlabs.com>
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 21, 2023
Part of cockroachdb#87785

This commit adds the error message for failed executions to the
statement and transaction insights pages. Since this value can contain
sensitive information, it must conform to the VIEWACTIVITY and
VIEWACTIVITYREDACTED system privielges.

Release note (ui change): adds "Error message" row to the statement and
transaction insights details pages. If the user has VIEWACTIVITY, they
are able to view the full error message. If they have
VIEWACTIVTYREDACTED, they are given a redacted error message. If they
have both, VIEWACTIVITYTREDACTED  takes precedence.
craig bot pushed a commit that referenced this issue Sep 22, 2023
110565: sql, insights: record error message for failed stmts r=yuzefovich a=xinhaoz

### Commit 1 sql: modify HasViewActivityOrViewActivityRedactedRole signature

This commit adds an additional return value to the function
`HasViewActivityOrViewActivityRedactedRole` which returns true if
the user has either `VIEWACTIVITY` or `VIEWACTIVITYREDACTED` roles.

In addition to that, the function will now return a boolean indicating
whether the user is not admin and has the `VIEWACTIVITYREDACTED` role.
This simplifies  areas where we need to redact values based on the
presence of this role, allowing callers to skip an additional role check
for admin and VIEWACTIVITYREDACTED.

Epic: none

Release note: None
### Commit 2 sql, insights: record error message for failed stmts

Part of: #87785.

Previously, the insights subsystem did not keep track of the error messages for failed executions, only the error codes.

This commit  updates the `[cluster|node]_execution_insights` and `[cluster|node]_txn_execution_insights` virtual tables to include a `last_error` column which contains the most recent error message.

Release note (sql change): adds `last_error` column to the `[cluster|node]_execution_insights` and `[cluster|node]_txn_execution_insights` tables which keeps track of the error message for failed executions.

110966: sql: fix gist decoding in foreign DB r=yuzefovich a=cucaroach

When a gist is exported and decoded in a foreign DB it should still
decode w/o panic. If a table scan result columns are fed into a
group by column the group by column code expects the result columns to
be in tact and panics. Fix this by making fake "unknown" result columns
so we can print the plan.

Making this change led to an unintended change in gists_tpce, this is
because the catalog was shared between the test files and I added a
table.  Fix this for the future by creating a new catalog for each file.
But with this change the table ids encoded in the gists_tpce got
smaller.

Epic: None
Fixes: #110964
Release note (bug fix): Fixed panic when decoding gist in DB without the
table referred to by the gist.


Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Xin Hao Zhang <xzhang@cockroahchlabs.com>
Co-authored-by: Tommy Reilly <treilly@cockroachlabs.com>
xinhaoz pushed a commit to xinhaoz/cockroach that referenced this issue Sep 22, 2023
Part of cockroachdb#87785

This commit adds the error message for failed executions to the
statement and transaction insights pages. Since this value can contain
sensitive information, it must conform to the VIEWACTIVITY and
VIEWACTIVITYREDACTED system privielges.

Release note (ui change): adds "Error message" row to the statement and
transaction insights details pages. If the user has VIEWACTIVITY, they
are able to view the full error message. If they have
VIEWACTIVTYREDACTED, they are given a redacted error message. If they
have both, VIEWACTIVITYTREDACTED  takes precedence.
craig bot pushed a commit that referenced this issue Sep 22, 2023
110849: ui: add error message for failed executions r=xinhaoz a=xinhaoz

Part of #87785

This commit adds the error message for failed executions to the
statement and transaction insights pages. Since this value can contain
sensitive information, it must conform to the VIEWACTIVITY and
VIEWACTIVITYREDACTED system privielges.

Release note (ui change): adds "Error message" row to the statement and
transaction insights details pages. If the user has VIEWACTIVITY, they
are able to view the full error message. If they have
VIEWACTIVTYREDACTED, they are given a redacted error message. If they
have both, VIEWACTIVITYTREDACTED  takes precedence.


------------------
Only the most recent commit (ui) should be reviewed.

------------------
<img width="1742" alt="image" src="https://github.com/cockroachdb/cockroach/assets/20136951/786d2a20-1610-4ff7-a8fb-0733d344147c">


111115: dbconsole: add tooltop to replication dash ranges chart r=koorosh a=kvoli

The Ranges chart in the replication dashboard can be easily misinterpreted when in a single node view, because the per-node metric is only reported by one node for each range. Most commonly, this is the leaseholder, and if not, the first replica in the range descriptor.

Add a tooltip which explains this nuance, taken from the [documentation](https://www.cockroachlabs.com/docs/stable/ui-replication-dashboard#ranges).

![image](https://github.com/cockroachdb/cockroach/assets/39606633/c2bcecd2-d8ed-46ec-b6f3-61927ecdce45)


Epic: None
Resolves: #111055

Release note (ui change): Added a tooltip to the ranges chart on the replication dashboard, describing the metric in single vs cluster view.

Co-authored-by: gtr <gerardo@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
@maryliag maryliag assigned xinhaoz and unassigned gtr Sep 25, 2023
@maryliag
Copy link
Contributor

maryliag commented Nov 3, 2023

Completed on #110849

@maryliag maryliag closed this as completed Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-escalation-improvement Having this feature would have made an escalation easier O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs
Projects
None yet
Development

No branches or pull requests

5 participants