Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: explicit zone configuration for the timeseries range on a new cluster #123762

Closed
andrewbaptist opened this issue May 7, 2024 · 1 comment · Fixed by #127034 or #128032
Closed

sql: explicit zone configuration for the timeseries range on a new cluster #123762

andrewbaptist opened this issue May 7, 2024 · 1 comment · Fixed by #127034 or #128032
Assignees
Labels
A-zone-configs C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@andrewbaptist
Copy link
Collaborator

andrewbaptist commented May 7, 2024

Is your feature request related to a problem? Please describe.
When a cluster is created, we create a zone configuration for the default zone as well as for most of the "special" ranges that are documented here: https://www.cockroachlabs.com/docs/v23.2/configure-replication-zones#create-a-replication-zone-for-a-system-range. The only system range we don't explicitly create a zone config for is the timeseries range.

After a test cluster is created, it has the following configurations defined:

> select target from [show all zone configurations];
                       target
----------------------------------------------------
  RANGE default
  DATABASE system
  TABLE system.public.lease
  RANGE meta
  RANGE system
  RANGE liveness
  TABLE system.public.replication_constraint_stats
  TABLE system.public.replication_stats
  TABLE system.public.statement_statistics
  TABLE system.public.transaction_statistics
  TABLE system.public.tenant_usage
  TABLE system.public.span_stats_tenant_boundaries
  TABLE system.public.statement_activity
  TABLE system.public.transaction_activity
(14 rows)

Note that the 3 ranges for meta, system, and liveness have explicit zone configs, but timeseries doesn't`.

Describe the solution you'd like
On a system creation, we should create an explicit timeseries zone config that exactly matches the RANGE default zone config. This would prevent confusion and allow users to make the right decisions on what RF they want. Otherwise users who don't carefully read the documentation are not likely to consider this range.

Describe alternatives you've considered

  1. We could just leave the system as it is. The default zone config change and the documentation may be sufficient.
  2. We could have the timeseries range use the system configuration if there isn't a more explicit one set.

Additional context
In a test cluster where all the "visible" zone configurations were set with RF=5 and two nodes were brought down, we had a LoQ event for some of the timeseries ranges. This was confusing to debug since it required figuring out that the timeseries range was not included.

Jira issue: CRDB-38531

Epic CRDB-40419

@andrewbaptist andrewbaptist added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster A-zone-configs T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels May 7, 2024
@craig craig bot closed this as completed in 6154ebf Jul 18, 2024
@arulajmani
Copy link
Collaborator

@andrewbaptist @rafiss @nvanbenschoten and I discussed this internally, and came to the conclusion that adding an explicit entry to system.zones for TIMESERIES may break user expectations in certain cases. In particular, users that configure a cluster wide replication factor of 5x by changing RANGE DEFAULT might be surprised that it no longer applies to the TIMESERIES range. At the least, we don't have compelling evidence to make a backwards incompatible change here. As a result, we're reverting the patch that addressed this issue #127853.

@rafiss had a great suggestion on how we might be able to address the intention of this issue without making a backwards incompatible change. Instead of copying over RANGE DEFAULT on RANGE TIMESERIES entirely, we could instead just copy over a single field (such as the GC TTL). That way, it no longer runs into the 5x replication factor change problem described above.

Another alternative would be to always include an entry for all named zones in the output for SHOW ZONE CONFIGURATIONS, even if there isn't an explicit zone configuration set in system.zones. In cases where there is no zone configuration explicitly set, we could just leave the raw_config_sql column empty (or say ALTER RANGE TIMESERIES DISCARD instead).

Extending the above, one could also imagine making a distinction between SHOW ZONE CONFIGURATIONS and SHOW ALL ZONE CONFIGURATIONS. Today, they both behave the same, but maybe we can re-purpose the latter to include zone configurations for all ranges/schema objects in the system.

@arulajmani arulajmani reopened this Jul 29, 2024
craig bot pushed a commit that referenced this issue Aug 15, 2024
128032: Reapply "bootstrap: create an explicit zoneconfig for timeseries data" r=rafiss a=rafiss

This partially reverts commit f9d47ce.

This time, rather than copying over the full zone config for the default range to the timeseries range, we only copy over gc.ttlseconds. This means that other attributes, most notably the replication factor, will be inherited from the default range.

This will make it more clear that the timeseries zone config can be changed independently from all the other zone configs.

fixes #123762

Release note (ops change): New clusters that are initialized for the first time will now have a zone config defined for the `timeseries` range. This zone config only specifies the gc.ttlseconds, so all the other attributes are inherited from the zone config of the `default` range, as they were before.

Clusters that are upgraded to v24.3 from a previous version will also
have this zone configuration applied to the timeseries range, as long as
that range does not already have a zone config.

128987: sql/sem/tree: remove duplicate `tree.NewDOidWithName` function r=mgartner a=mgartner

The `tree.NewDOidWithName` function has been removed. All previous
callers now use `tree.NewDOidWithTypeAndName` which is exactly the same.

Epic: None

Release note: None


129053: sql/parser: only retain scanned SQL comments when necessary r=mgartner a=mgartner

In #86968 the scanner gained the ability to retain comments in scanned
SQL strings, and this was an always-on feature. However, the comments
are only used when populating the crdb_internal.cluster_queries table,
see `sql.formatActiveQuery`. Now, comments are only retained when the
parser is used from this function, reducing allocations for all other
cases.

Fixes #127713

Release note: None


Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
@craig craig bot closed this as completed in ec66ae7 Aug 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-zone-configs C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
No open projects
Status: Done
3 participants