-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql,kv: extend SHOW RANGES to include table/index size estimates #97858
Comments
NB: we may have a package at https://github.com/cockroachlabs/metrics-exporter that needs to be updated accordingly |
|
It might replace #20712 depending on where #98339 gets to. It will not replace #85834 nor make #86017 easier to resolve. For that, we need a stored cache of the computed approximations (presumably alongside system.descriptors) and expose that data via the pg-compatible builtin function. Building that cache is out of scope here and in #98339 as far as i understand. |
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON value. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. In the JSON value, the 'range_count' key refers to the number of ranges that the object spans. The 'total_stats' key refers to the MVCC statistics of the object's span.
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON value. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. In the JSON value, the 'range_count' key refers to the number of ranges that the object spans. The 'total_stats' key refers to the MVCC statistics of the object's span.
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON value. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. In the JSON value, the 'range_count' key refers to the number of ranges that the object spans. The 'total_stats' key refers to the MVCC statistics of the object's span.
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. '[key|val|sys|live|intent]_count' and '[key|val|sys|live|intent]_bytes' are defined in enginepb.MVCCStats.
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. '[key|val|sys|live|intent]_count' and '[key|val|sys|live|intent]_bytes' are defined in enginepb.MVCCStats.
103128: sql: extend SHOW RANGES to include object size estimates r=zachlite a=zachlite This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: #97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. '[key|val|sys|live|intent]_count' and '[key|val|sys|live|intent]_bytes' are defined in enginepb.MVCCStats. 105114: sql: zero out crdb_internal.node_statement_statistics(anonymized) column r=knz a=rafiss This column is not used, and is expensive to compute. In the past, we used to redact identifiers from queries for data privacy reasons, but now we no longer have this policy. (Constants and user-supplied data are still redacted.) To mitigate impact, we zero out the column value instead of removing the column completely. Epic: None Release note: None 105116: sql: add an advisory check between default_tenant and alter tenant r=yuzefovich a=knz Fixes #105115. Epic: CRDB-26691 Prior to this patch, it was just too easy to mistakenly configure `default_tenant` to a non-existent tenant or one without service; or to disable the service for a tenant currently serving default traffic. This patch makes this mistake harder, by adding a coherence check between the cluster setting and the service mode. The coherence check can be disabled via the (new) cluster setting `server.controller.default_tenant.check_service.enabled`. Release note: None Co-authored-by: Zach Lite <zach@cockroachlabs.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. 'key_count' is the number of meta keys tracked under key_bytes. 'key_bytes' is the number of bytes stored in all non-system point keys, including live, meta, old, and deleted keys. Only meta keys really account for the "full" key; value keys only for the timestamp suffix. 'val_count' is the number of meta values tracked under val_bytes. 'val_bytes' is the number of bytes in all non-system version values, including meta values. 'sys_count' is the number of meta keys tracked under sys_bytes. 'sys_bytes' is the number of bytes stored in system-local kv-pairs. This tracks the same quantity as (key_bytes + val_bytes), but for system-local metadata keys (which aren't counted in either key_bytes or val_bytes). Each of the keys falling into this group is documented in keys/constants.go under the LocalPrefix constant and is prefixed by either LocalRangeIDPrefix or LocalRangePrefix. 'live_count' is the number of meta keys tracked under live_bytes. 'live_bytes' is the number of bytes stored in keys and values which can in principle be read by means of a Scan or Get in the far future, including intents but not deletion tombstones (or their intents). Note that the size of the meta kv pair (which could be explicit or implicit) is included in this. Only the meta kv pair counts for the actual length of the encoded key (regular pairs only count the timestamp suffix). 'intent_count' is the number of keys tracked under intent_bytes. It is equal to the number of meta keys in the system with a non-empty Transaction proto. 'intent_bytes' is the number of bytes in intent key-value pairs (without their meta keys).
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. 'key_count' is the number of meta keys tracked under key_bytes. 'key_bytes' is the number of bytes stored in all non-system point keys, including live, meta, old, and deleted keys. Only meta keys really account for the "full" key; value keys only for the timestamp suffix. 'val_count' is the number of meta values tracked under val_bytes. 'val_bytes' is the number of bytes in all non-system version values, including meta values. 'sys_count' is the number of meta keys tracked under sys_bytes. 'sys_bytes' is the number of bytes stored in system-local kv-pairs. This tracks the same quantity as (key_bytes + val_bytes), but for system-local metadata keys (which aren't counted in either key_bytes or val_bytes). Each of the keys falling into this group is documented in keys/constants.go under the LocalPrefix constant and is prefixed by either LocalRangeIDPrefix or LocalRangePrefix. 'live_count' is the number of meta keys tracked under live_bytes. 'live_bytes' is the number of bytes stored in keys and values which can in principle be read by means of a Scan or Get in the far future, including intents but not deletion tombstones (or their intents). Note that the size of the meta kv pair (which could be explicit or implicit) is included in this. Only the meta kv pair counts for the actual length of the encoded key (regular pairs only count the timestamp suffix). 'intent_count' is the number of keys tracked under intent_bytes. It is equal to the number of meta keys in the system with a non-empty Transaction proto. 'intent_bytes' is the number of bytes in intent key-value pairs (without their meta keys).
This commit adds object size estimates to `SHOW RANGES WITH DETAILS`. A new override of `crdb_internal.tenant_span_stats` is introduced to leverage batched span statistics requests to KV, so only 1 fan-out is required. Resolves: cockroachdb#97858 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Release note (sql change): The SHOW RANGES command will now emit span statistics when the DETAILS option is specified. The statistics are included in a new column named 'span_stats', as a JSON object. The statistics are calculated for the identifier of each row: `SHOW RANGES WITH DETAILS` will compute span statistics for each range. `SHOW RANGES WITH TABLES, DETAILS` will compute span statistics for each table, etc. The 'span_stats' JSON object has the following keys: - approximate_disk_bytes - [key|val|sys|live|intent]_count - [key|val|sys|live|intent]_bytes 'approximate_disk_bytes' is an approximation of the total on-disk size of the given object. 'key_count' is the number of meta keys tracked under key_bytes. 'key_bytes' is the number of bytes stored in all non-system point keys, including live, meta, old, and deleted keys. Only meta keys really account for the "full" key; value keys only for the timestamp suffix. 'val_count' is the number of meta values tracked under val_bytes. 'val_bytes' is the number of bytes in all non-system version values, including meta values. 'sys_count' is the number of meta keys tracked under sys_bytes. 'sys_bytes' is the number of bytes stored in system-local kv-pairs. This tracks the same quantity as (key_bytes + val_bytes), but for system-local metadata keys (which aren't counted in either key_bytes or val_bytes). Each of the keys falling into this group is documented in keys/constants.go under the LocalPrefix constant and is prefixed by either LocalRangeIDPrefix or LocalRangePrefix. 'live_count' is the number of meta keys tracked under live_bytes. 'live_bytes' is the number of bytes stored in keys and values which can in principle be read by means of a Scan or Get in the far future, including intents but not deletion tombstones (or their intents). Note that the size of the meta kv pair (which could be explicit or implicit) is included in this. Only the meta kv pair counts for the actual length of the encoded key (regular pairs only count the timestamp suffix). 'intent_count' is the number of keys tracked under intent_bytes. It is equal to the number of meta keys in the system with a non-empty Transaction proto. 'intent_bytes' is the number of bytes in intent key-value pairs (without their meta keys).
Describe the problem
Currently
SHOW RANGES WITH DETAILS
only includes range sizes, not object sizes.When a range contains multiple table/indexes, the range size cannot be used to estimate the table/index size any more.
So we need a replacement for where people would previously use SUM on the range size.
This can use the new span stats computation that Zach implemented in #96223 .
Jira issue: CRDB-24925
Epic: CRDB-24928
The text was updated successfully, but these errors were encountered: