storage: make use of telemetry channel for structured event logging #85589
Labels
A-storage
Relating to our storage engine (Pebble) on-disk storage.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-storage
Storage Team
Comments
nicktrav added the C-enhancement, A-storage, and T-storage labels on Aug 3, 2022
nicktrav added a commit to nicktrav/cockroach that referenced this issue on Aug 19, 2022:
Add the `StoreStats` event type, a per-store event emitted to the `TELEMETRY` logging channel. This event type will be computed from the Pebble metrics for each store. Emit a `StoreStats` event periodically, by default, once per hour, per store. Touches cockroachdb#85589. Release note: None. Release justification: low risk, high benefit changes to existing functionality.
nicktrav added further commits to nicktrav/cockroach that referenced this issue, each with the same commit message, on Aug 22, Aug 23, Aug 24, and Sep 19, 2022.
craig bot pushed a commit that referenced this issue on Sep 21, 2022:

86277: eventpb: add storage event types r=jbowens,sumeerbhola a=nicktrav

Add the `StoreStats` event type, a per-store event emitted to the `TELEMETRY` logging channel. This event type will be computed from the Pebble metrics for each store.

Emit a `StoreStats` event periodically, by default, once per hour, per store.

Touches #85589.

Release note: None.

Release justification: low risk, high benefit changes to existing functionality.

87142: workload/mixed-version/schemachanger: re-enable mixed version workload r=fqazi a=fqazi

Fixes: #58489 #87477

Previously the mixed version schema changer workload was disabled because of the lack of version gates. These changes will do the following:

- Start reporting errors on this workload again.
- Disable trigrams in a mixed version state.
- Disable the insert part of the workload in a mixed version state (there is an optimizer on 22.1 that can cause some of the queries to fail)

Release justification: low risk only extends test coverage

87883: schedulerlatency: export Go scheduling latency metric r=irfansharif a=irfansharif

And record data into CRDB's internal time-series database. Informs #82743 and #87823. To export scheduling latencies to prometheus, we choose an exponential bucketing scheme with base multiple of 1.1, and the output range bounded to [50us, 100ms). This makes for ~70 buckets. It's worth noting that the default histogram buckets used in Go are not fit for our purposes. If we care about improving it, we could consider patching the runtime.

```
bucket[  0] width=0s boundary=[-Inf, 0s)
bucket[  1] width=1ns boundary=[0s, 1ns)
bucket[  2] width=1ns boundary=[1ns, 2ns)
bucket[  3] width=1ns boundary=[2ns, 3ns)
bucket[  4] width=1ns boundary=[3ns, 4ns)
...
bucket[270] width=16.384µs boundary=[737.28µs, 753.664µs)
bucket[271] width=16.384µs boundary=[753.664µs, 770.048µs)
bucket[272] width=278.528µs boundary=[770.048µs, 1.048576ms)
bucket[273] width=32.768µs boundary=[1.048576ms, 1.081344ms)
bucket[274] width=32.768µs boundary=[1.081344ms, 1.114112ms)
...
bucket[717] width=1h13m18.046511104s boundary=[53h45m14.046488576s, 54h58m32.09299968s)
bucket[718] width=1h13m18.046511104s boundary=[54h58m32.09299968s, 56h11m50.139510784s)
bucket[719] width=1h13m18.046511104s boundary=[56h11m50.139510784s, 57h25m8.186021888s)
bucket[720] width=57h25m8.186021888s boundary=[57h25m8.186021888s, +Inf)
```

Release note: None

Release justification: observability-only PR, low-risk high-benefit; would help understand admission control out in the wild

88179: ui/cluster-ui: fix no most recent stmt for active txns r=xinhaoz a=xinhaoz

Fixes #87738

Previously, active txns could have an empty 'Most Recent Statement' column, even if their executed statement count was non-zero. This was due to the most recent query text being populated by the active stmt, which could be empty at the time of querying.

This commit populates the last statement text for a txn even when it is not currently executing a query.

This commit also removes the `isFullScan` field from active txn pages, as we cannot fill this field out without all stmts in the txn.

Release note (ui change): Full scan field is removed from active txn details page.

Release note (bug fix): active txns with non-zero executed statement count now always have populated stmt text, even when no stmt is being executed.

88334: kvserver: align Raft recv/send queue sizes r=erikgrinaker a=pavelkalinnikov

Fixes #87465

Release justification: performance fix

Release note: Made sending and receiving Raft queue sizes match. Previously the receiver could unnecessarily drop messages in situations when the sending queue is bigger than the receiving one.

Co-authored-by: Nick Travers <travers@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this issue on Sep 21, 2022:
Add the `StoreStats` event type, a per-store event emitted to the `TELEMETRY` logging channel. This event type will be computed from the Pebble metrics for each store. Emit a `StoreStats` event periodically, by default, once per hour, per store. Touches #85589. Release note: None. Release justification: low risk, high benefit changes to existing functionality.
Marking this as done in #86277.
Is your feature request related to a problem? Please describe.
Cockroach supports logging structured events (see the proto definition, here) to the `TELEMETRY` logging channel. On Cockroach Cloud, these events are collected and made available elsewhere for reporting purposes.
Describe the solution you'd like
The storage layer should make use of these structured events to log characteristics about clusters that would help us a) identify areas for improvement, and b) track our improvements over time.
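To make the request concrete, here is a minimal sketch in Go of what periodic, per-store structured logging could look like. It is not the CockroachDB implementation: the `StoreStats` fields, the `MetricsFunc` type, the `emitOnce` and `emitLoop` helpers, and the `sink` callback are all hypothetical names chosen for illustration, and the real event is defined in the `eventpb` protos and written through the logging package rather than a callback.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// StoreStats is a hypothetical, trimmed-down stand-in for a per-store event.
// The field names are assumptions for illustration only.
type StoreStats struct {
	StoreID         int32 `json:"store_id"`
	SSTableCount    int64 `json:"sstable_count"`
	CompactionCount int64 `json:"compaction_count"`
}

// MetricsFunc returns a snapshot of one store's metrics. It stands in for
// reading the storage engine's metrics for that store.
type MetricsFunc func() StoreStats

// emitOnce builds one event per store and hands its JSON payload to sink,
// tagged with the name of the logging channel.
func emitOnce(stores []MetricsFunc, sink func(channel, payload string)) error {
	for _, snapshot := range stores {
		b, err := json.Marshal(snapshot())
		if err != nil {
			return err
		}
		sink("TELEMETRY", string(b))
	}
	return nil
}

// emitLoop calls emitOnce every interval until stop is closed.
func emitLoop(stores []MetricsFunc, interval time.Duration,
	sink func(channel, payload string), stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := emitOnce(stores, sink); err != nil {
				fmt.Println("store stats emission failed:", err)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	// Two fake stores with fixed metrics, purely for demonstration.
	stores := []MetricsFunc{
		func() StoreStats { return StoreStats{StoreID: 1, SSTableCount: 42, CompactionCount: 7} },
		func() StoreStats { return StoreStats{StoreID: 2, SSTableCount: 13, CompactionCount: 2} },
	}
	sink := func(channel, payload string) { fmt.Printf("[%s] %s\n", channel, payload) }

	// Emit once immediately; a long-running process would instead start
	// emitLoop(stores, time.Hour, sink, stop) in a goroutine.
	if err := emitOnce(stores, sink); err != nil {
		panic(err)
	}
}
```

Separating `emitOnce` from the ticker loop keeps the event construction testable without waiting on a timer, and the interval stays a parameter so an hourly default can be overridden.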
Additional context
There is additional documentation in the following:
Jira issue: CRDB-18327
Epic CRDB-17515