-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ui: remove polling from fingerprints pages, allow new stats requests while one is in flight #98331
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, 2 of 2 files at r2, 1 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)
-- commits
line 37 at r3:
nit: experiences
-- commits
line 43 at r3:
nit: this phrase is confusing
-- commits
line 50 at r3:
nice!
pkg/ui/workspaces/db-console/src/redux/cachedDataReducer.ts
line 324 at r3 (raw file):
this.pendingRequest = request; return request;
can you add test to confirm the behaviour and/or a loom showing network tab before/after?
1d517dc
to
ef95124
Compare
7b8813b
to
34a754a
Compare
I have to do some testing on CC still. I'll add that Loom once it is done. Edit: Included in the description. |
1f0eebe
to
57eaea1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @xinhaoz)
-- commits
line 6 at r9:
nit: typo finegerprints
-- commits
line 29 at r9:
Should this be labeled SQL Activity pages instead of fingerprint pages?
-- commits
line 31 at r9:
Should a refresh/apply button be added to the page to make it clear that it's not refreshing?
pkg/ui/workspaces/cluster-ui/src/transactionDetails/transactionDetails.tsx
line 212 at r9 (raw file):
refreshData = (): void => { console.log("aaaaaaaaaaah");
nit: I think this might need to be removed..
pkg/ui/workspaces/db-console/src/util/api.ts
line 782 at r9 (raw file):
return timeoutFetch( serverpb.StatementsResponse, `${STATUS_PREFIX}/combinedstmts?${queryStr}`,
Is this a new api? If so should it be a concern this will likely break during upgrades?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dongniwang, @j82w, and @maryliag)
Previously, maryliag (Marylia Gutierrez) wrote…
nit: this phrase is confusing
Rephrased.
Previously, j82w (Jake) wrote…
nit: typo
finegerprints
Fixed.
Previously, j82w (Jake) wrote…
Should this be labeled SQL Activity pages instead of fingerprint pages?
I wanted to distinguish them from insights pages / active exec pages, I can change it SQL Activity if it's more clear though.
Previously, j82w (Jake) wrote…
Should a refresh/apply button be added to the page to make it clear that it's not refreshing?
Sounds like a good idea -- cc @dongniwang
pkg/ui/workspaces/cluster-ui/src/transactionDetails/transactionDetails.tsx
line 212 at r9 (raw file):
Previously, j82w (Jake) wrote…
nit: I think this might need to be removed..
Oops, removed.
pkg/ui/workspaces/db-console/src/redux/cachedDataReducer.ts
line 324 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
can you add test to confirm the behaviour and/or a loom showing network tab before/after?
Loom added.
pkg/ui/workspaces/db-console/src/util/api.ts
line 782 at r9 (raw file):
Previously, j82w (Jake) wrote…
Is this a new api? If so should it be a concern this will likely break during upgrades?
It's not a new api. It existed since the creation of persisted stats from 21.2, we just did not use it and instead opted to have the older api just route to this one using a req param in case we needed to revert anything. We're fully committed to persisted stats now though so I don't see why we shouldn't be using it.
e46d58d
to
d926a44
Compare
I've updated the Looms to also show the txn details page since that request behaviour was modified as well, and also to show the long loading messages appearing on both stmt and txn overview pages. |
d926a44
to
5db079d
Compare
a520cd0
to
2b15c3b
Compare
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 40dd354 to blathers/backport-release-22.1-98331: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. error creating merge commit from 40dd354 to blathers/backport-release-22.2-98331: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Changes to allow in-flight request replacement in this PR: cockroachdb#98331 caused breaking changes to `KeyedCachedDataReducer`s using the sql api. A `KeyedCachedDataReducer` using the sql api would issue the requests to the same endpoint and despite the added `allowReplacementRequests` flag being `false`, in-flight requests would be replaced (specifically, when fetching for initial state). This would prevent all requests but the last one from being stored in the reducer state and put them in a permanent `inFlight` status. Consequently, our pages would break as the `inFlight` status would prevent us from ever fetching data, putting these pages in a permanent loading state. This PR adds checks to ensure `allowReplacementRequests` is enabled when we receive a response, before deciding whether to omit it from the reducer state. It's important to note that this is still not an ideal solution for `KeyedCachedDataReducer`s, should we ever want to replace inflight requests for one. The checks to replace an in-flight request still occur at the reducer-level, whereas in a `KeyedCachedDataReducer` we'd like them to occur at the state-level (i.e. instead of replacement checks for all requests across the reducer, we should check for requests that impact a specific key's state). This way we scope replacements to each key in the reducer, allowing us to fetch data for each key (which is the intended functionality). Enabling this for a `KeyedCachedDataReducer` without making this change will cause the breaking behaviour above. This PR also includes some cleanup to the database details API, namely: - camel casing response fields - encapsulating the database span stats response into its own object (instead of across multiple objects), this helps scope any errors deriving from the query to the single stats object Release note (bug fix): Fixed replacement of in-flight requests for `KeyedCachedDataReducer`s to prevent permanent loading on requests stuck on an `inFlight` status. https://cockroachlabs.atlassian.net/browse/DOC-1355 #Informs: cockroachdb#33316 #Epic: CRDB-8035
Changes to allow in-flight request replacement in this PR: cockroachdb#98331 caused breaking changes to `KeyedCachedDataReducer`s using the sql api. A `KeyedCachedDataReducer` using the sql api would issue the requests to the same endpoint and despite the added `allowReplacementRequests` flag being `false`, in-flight requests would be replaced (specifically, when fetching for initial state). This would prevent all requests but the last one from being stored in the reducer state and put them in a permanent `inFlight` status. Consequently, our pages would break as the `inFlight` status would prevent us from ever fetching data, putting these pages in a permanent loading state. This PR adds checks to ensure `allowReplacementRequests` is enabled when we receive a response, before deciding whether to omit it from the reducer state. It's important to note that this is still not an ideal solution for `KeyedCachedDataReducer`s, should we ever want to replace inflight requests for one. The checks to replace an in-flight request still occur at the reducer-level, whereas in a `KeyedCachedDataReducer` we'd like them to occur at the state-level (i.e. instead of replacement checks for all requests across the reducer, we should check for requests that impact a specific key's state). This way we scope replacements to each key in the reducer, allowing us to fetch data for each key (which is the intended functionality). Enabling this for a `KeyedCachedDataReducer` without making this change will cause the breaking behaviour above. This PR also includes some cleanup to the database details API, namely: - camel casing response fields - encapsulating the database span stats response into its own object (instead of across multiple objects), this helps scope any errors deriving from the query to the single stats object Release note (bug fix): Fixed replacement of in-flight requests for `KeyedCachedDataReducer`s to prevent permanent loading on requests stuck on an `inFlight` status. https://cockroachlabs.atlassian.net/browse/DOC-1355 #Informs: cockroachdb#33316 #Epic: CRDB-8035
…99123 98712: storage: expose separate WriteBatch interface r=sumeerbhola a=jbowens Adapt NewUnindexedBatch to no longer take a writeOnly parameter. Instead, a new NewWriteBatch method is exposed that returns a WriteBatch type that does not provide Reader facilities. Future work may remove UnindexedBatch altogether, updating callers to explicitly maintain separate Readers and WriteBatches. Epic: None Release note: None 98807: spanconfig: add TestBlockedKVSubscriberDisablesQueues r=irfansharif a=irfansharif Regression test for #98422 which disables {split,replicate,mvccGC} queues until subscribed to span configs. We're only slightly modifying the existing merge-queue specific TestBlockedKVSubscriberDisablesMerges. Release note: None 98823: sql: address a couple of old TODOs r=yuzefovich a=yuzefovich This commit addresses a couple of old TODOs in the `connExecutor.dispatchToExecutionEngine` method. First, it simply removes the now-stale TODO about needing to `Step()` the txn before executing the query. The TODO mentioned things about the old way FK checks were performed, but we now always run checks in a deferred fashion, and perform the `Step`s correctly there for cascades and checks, thus, the TODO can be removed. (The TODO itself explicitly mentions that it can be removed once we do checks and cascades in the deferred fashion.) Second, it addresses a TODO about how some plan information is saved. Up until 961e66f the cleanup of `planNode` tree and `flow` infra was intertwined, so in 6ae4881 in order to "sample" the plan with correct info (some of which is only available _after_ the execution is done) we delegated that sampling to `planTop.close`. However, with `planNode` tree and `flow` cleanups refactored and de-coupled, we no longer have to close the plan early on. As a result, we now close the plan in a defer right after we create it (now it's the only place we do so), and this commit makes it so that we record the plan info explicitly right after returning from the execution engine, thus, addressing the second TODO. Epic: None Release note: None 98896: tracing: fix WithEventListeners option with verbose parent r=yuzefovich a=yuzefovich Previously, when using `WithEventListeners` span option and not explicitly specifying `WithRecording` option, we could incorrectly use the structured recording. In particular, this was the case when the parent has verbose recording, and this is now fixed. At the moment, this span option is used only by the backup and restore processors, so the impact would be that their verbose recording wouldn't be included into the session recording (perhaps we don't actually expose that anyway). Epic: None Release note: None 99079: sql: fix hint for unknown udfs in views and funcs r=rharding6373 a=rharding6373 Fixes a hint for some instances of an "unknown function" error that was not properly applied previously. Informs: #99002 Epic: None Release note: None 99095: ui: fix inflight request replacement, cleanup database details api r=THardy98 a=THardy98 Changes to allow in-flight request replacement in this PR: #98331 caused breaking changes to `KeyedCachedDataReducer`s. Despite the added `allowReplacementRequests` flag being `false`, in-flight requests would be replaced (specifically, when fetching for initial state). This would prevent all requests but the last one from being stored in the reducer state and leave the remaining requests in a permanent `inFlight` status. Consequently, our pages would break as the permanent `inFlight` status on these requests would prevent us from ever fetching data, putting these pages in a permanent loading state. This PR adds checks to ensure `allowReplacementRequests` is enabled when we receive a response, before deciding whether to omit it from the reducer state. It's important to note that this is still not an ideal solution for `KeyedCachedDataReducer`s, should we ever want to replace inflight requests for one. The checks to replace an in-flight request still occur at the reducer-level, whereas in a `KeyedCachedDataReducer` we'd like them to occur at the state-level (i.e. instead of replacement checks for all requests across the reducer, we should check for requests that impact a specific key's state). This way we scope replacements to each key in the reducer, allowing us to fetch data for each key independently (which is the intended functionality). Enabling this for a `KeyedCachedDataReducer` without making this change I believe will cause the breaking behaviour above. This PR also includes some cleanup to the database details API, namely: - camel casing response fields - encapsulating the database span stats response into its own object (instead of across multiple objects), this helps scope any errors deriving from the query to the single stats object **BEFORE** https://www.loom.com/share/57c9f278376a4551977398316d6ed7b6 **AFTER** https://www.loom.com/share/b2d4e074afca4b42a4f711171de3fd72 Release note (bug fix): Fixed replacement of in-flight requests for `KeyedCachedDataReducer`s to prevent permanent loading on requests stuck on an `inFlight` status. 99097: upgrades: fix backfilling of special roles in user ID migrations r=rafiss a=andyyang890 upgrades: fix system.privileges user ID migration This patch adds a missing case for the public role in the backfilling portion of the system.privileges user ID migration. Release note: None --- upgrades: fix system.database_role_settings user ID migration This patch adds a missing case for the empty role in the backfilling portion of the system.database_role_settings user ID migration. Release note: None --- Part of #87079 99113: server: use no-op cost controller for shared-process tenants r=knz a=stevendanna In the long run, all rate limiting of tenants should be controlled by a tenant capability. However, at the moment, we do not have the infrastructure to read tenant capabilities from the tenant process. Since, the main user of shared-process tenants is the new, experimental unified-architecture were we do not anticipate needing the cost controller for most users, this PR injects a no-op cost controller for shared-process tenants. Epic: CRDB-23559 Release note: None 99120: tracing: allow collection of stack history for Structured spans as well r=dt a=adityamaru Previously, we mandated that the span must be Verbose to be able to collect and emit a Structured event containing stack history. This change permits Structured spans to also collect stack history. Release note: None Epic: none 99123: sql: skip flaky TestSQLStatsDataDriven r=ericharmeling a=ericharmeling This commit skips TestSQLStatsDataDriven. This test is flaky. Part of #89861. Epic: none Release note: None Co-authored-by: Jackson Owens <jackson@cockroachlabs.com> Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: rharding6373 <rharding6373@users.noreply.github.com> Co-authored-by: Thomas Hardy <thardy@cockroachlabs.com> Co-authored-by: Andy Yang <yang@cockroachlabs.com> Co-authored-by: Steven Danna <danna@cockroachlabs.com> Co-authored-by: adityamaru <adityamaru@gmail.com> Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com>
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: cockroachdb#98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves cockroachdb#99524 Epic: None Release note: None
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: cockroachdb#98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves cockroachdb#99524 Epic: None Release note: None
…99752 #99774 99433: opt: fixup CTE stats on placeholder queries r=cucaroach a=cucaroach During optbuilder phase we copy the initial expressions stats into the fake-rel but this value can change when placeholders are assigned so add code in AssignPlaceholders to rebuild the cte if the stats change. Fixes: #99389 Epic: none Release note: none 99516: metrics: improve ux around _status/vars output r=aadityasondhi a=dhartunian Previously, the addition of the `tenant` metric label was applied uniformly and could result in confusion for customers who never enable multi-tenancy or c2c. The `tenant="system"` label carries little meaning when there's no tenancy in use. This change modifies the system tenant label application to only happen when a non- sytem in-process tenant is created. Additionally, an environment variable: `COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS` can be set to `false` to disable the new `tenant` and `node_id` labels. This can be used on single-process tenants to disable the `tenant` label. Resolves: #94668 Epic: CRDB-18798 Release note (ops change): The `COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS` env var can be used to disable the newly introduced metric labels in the `_status/vars` output if they conflict with a customer's scrape configuration. 99522: jobsprofiler: store DistSQL diagram of jobs in job info r=dt a=adityamaru This change teaches import, cdc, backup and restore to store their DistSQL plans in the job_info table under a timestamped info key. The generation and writing of the plan diagram is done asynchronously so as to not slow down the execution of the job. A new plan will be stored everytime the job sets up its DistSQL flow. Release note: None Epic: [CRDB-8964](https://cockroachlabs.atlassian.net/browse/CRDB-8964) Informs: #99729 99574: streamingccl: skip acceptance/c2c on remote cluster setup r=stevendanna a=msbutler acceptance/c2c currently fails when run on a remote cluster. This patch ensures the test gets skipped when run on a remote cluster. There's no need to run the test on a remote cluster because the other c2c roachtests provide sufficient coverage. Fixes #99553 Release note: none 99691: codeowners: update sql obs to cluster obs r=maryliag a=maryliag Update mentions of `sql-observability` to `cluster-observability`. Epic: none Release note: None 99712: ui: connect metrics provider to metrics timescale object r=xinhaoz a=dhartunian Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: #98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves #99524 Epic: None Release note: None 99733: telemetry: add FIPS-specific channel r=knz a=rail Previously, all official builds were reporting using the same telemetry channel. This PR adds an new telemetry channel for the FIPS build target. Fixes: CC-24110 Epic: DEVINF-478 Release note: None 99745: spanconfigsqlwatcher: deflake TestSQLWatcherOnEventError r=arulajmani a=arulajmani Previously, this test was setting the no-op checkpoint duration to be every hour to effectively disable checkpoints. Doing so is integral to what the test is testing. However, this was a lie, given how `util.Every` works -- A call to `ShouldProcess` returns true the very first time. This patch achieves the original goal by introducing a new testing knob. Previously, the test would fail in < 40 runs locally. Have this running strong for ~1000 runs. Fixes #76765 Release note: None 99747: roachtest: use persistent disks for disk-stall tests r=jbowens a=nicktrav Currently, the `disk-stall` tests use local SSDs. When run on GCE VMs, a higher test flake rate is observed due to known issues with fsync latency for local SSDs. Switch the test to use persistent disks instead. Touches: #99372. Release note: None. Epic: CRDB-20293 99752: kvserver: bump tolerance more r=ajwerner a=ajwerner I'm not digging into this more, but the test is flakey. Epic: none https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_UnitTests_BazelUnitTests/9161972?showRootCauses=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true Release note: None 99774: *: identify remaining uses of TODOSQLCodec r=stevendanna a=knz The `TODOSQLCodec` was a bug waiting to happen. The only reasonable remaining purpose is for use in tests. As such, this change moves its definition to a test-only package (we have a linter that verifies that `testutils` is not included in non-test code). This change then identifies the one non-reasonable remaining purposes and identifies it properly as a bug linked to #48123. Release note: None Epic: None Co-authored-by: Tommy Reilly <treilly@cockroachlabs.com> Co-authored-by: David Hartunian <davidh@cockroachlabs.com> Co-authored-by: adityamaru <adityamaru@gmail.com> Co-authored-by: Michael Butler <butler@cockroachlabs.com> Co-authored-by: maryliag <marylia@cockroachlabs.com> Co-authored-by: Rail Aliiev <rail@iqchoice.com> Co-authored-by: Arul Ajmani <arulajmani@gmail.com> Co-authored-by: Nick Travers <travers@cockroachlabs.com> Co-authored-by: ajwerner <awerner32@gmail.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: #98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves #99524 Epic: None Release note: None
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: cockroachdb#98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves cockroachdb#99524 Epic: None Release note: None
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: cockroachdb#98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves cockroachdb#99524 Epic: None Release note: None
Previously, the `MetricsDataProvider` component queried the redux store for the `TimeScale` object which contained details of the currently active time window. This piece of state was assumed to update to account for the "live" moving window that metrics show when pre-set lookback time windows are selected. A recent PR: cockroachdb#98331 removed the feature that polled new data from SQL pages, which also disabled polling on metrics pages due to the re-use of `TimeScale`. This commit modifies the `MetricsDataProvider` to instead read the `metricsTime` field of the `TimeScaleState` object. This object was constructed for use by the `MetricsDataProvider` but was not wired up to the component. Resolves cockroachdb#99524 Epic: None Release note: None
See individual commits.
The changes here serve as a base for the performance improvements we'll be making.
Since they're not really related to those changes specifically (e.g. adding limit/sort and splitting the api calls), I've put them in their own PR here to make reviewing easier.
Loom verifying behaviour and showing new requests being dispatched while ones are pending
Pages shown:
https://www.loom.com/share/3448117bfecf404c8d698f4ad1240e8c
CC:
https://www.loom.com/share/bb30b51ebe5144528ab0c6fabdbfb2f1