
spanconfig: introduce spanconfig.Reconciler #71994

Merged
merged 3 commits on Dec 15, 2021

Commits on Dec 15, 2021

  1. server,kvaccessor: record span configs during tenant creation/gc

    For newly created tenants, we want to ensure hard splits on tenant
    boundaries. The source of truth for split points in the span configs
    subsystem is the contents of system.span_configurations. To ensure
    hard splits, we insert a single key record at the tenant prefix. In a
    future commit we'll introduce the spanconfig.Reconciler process, which
    runs per tenant and governs all config entries under each tenant's
    purview. This has implications for the initial record we're talking
    about (it might get cleaned up, for example); we'll explore this in
    tests for the Reconciler itself.
    
    Creating a single key record is easy enough -- we could've written
    directly to system.span_configurations. When a tenant is GC-ed,
    however, we need to clear out all tenant state transactionally. To
    that end we plumb a txn-scoped KVAccessor into the planner where
    crdb_internal.destroy_tenant is executed. This lets us easily delete
    all abandoned tenant span config records.
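
    To make the mechanics concrete, here's a minimal sketch of the two
    operations described above. The kvAccessor interface below is only
    modeled on spanconfig.KVAccessor, and its method name/signature is an
    assumption made for this example; keys.MakeTenantPrefix and the
    roachpb span config types are real, but none of this is the actual
    code in this commit.

        package spanconfigseed // illustrative package, not part of the tree

        import (
          "context"

          "github.com/cockroachdb/cockroach/pkg/keys"
          "github.com/cockroachdb/cockroach/pkg/roachpb"
        )

        // kvAccessor approximates a txn-scoped spanconfig.KVAccessor
        // (assumed signature).
        type kvAccessor interface {
          UpdateSpanConfigEntries(
            ctx context.Context, toDelete []roachpb.Span, toUpsert []roachpb.SpanConfigEntry,
          ) error
        }

        // seedTenantSpanConfig installs the single seed record at the
        // tenant prefix, guaranteeing a hard split at the tenant boundary.
        func seedTenantSpanConfig(
          ctx context.Context, kva kvAccessor, tenID roachpb.TenantID,
        ) error {
          prefix := keys.MakeTenantPrefix(tenID)
          entry := roachpb.SpanConfigEntry{
            Span:   roachpb.Span{Key: prefix, EndKey: prefix.PrefixEnd()},
            Config: roachpb.SpanConfig{},
          }
          return kva.UpdateSpanConfigEntries(ctx, nil /* toDelete */, []roachpb.SpanConfigEntry{entry})
        }

        // clearTenantSpanConfigs deletes every span config record under a
        // tenant's keyspace, as crdb_internal.destroy_tenant would via its
        // txn-scoped accessor.
        func clearTenantSpanConfigs(
          ctx context.Context, kva kvAccessor, tenID roachpb.TenantID,
        ) error {
          prefix := keys.MakeTenantPrefix(tenID)
          tenantSpan := roachpb.Span{Key: prefix, EndKey: prefix.PrefixEnd()}
          return kva.UpdateSpanConfigEntries(ctx, []roachpb.Span{tenantSpan}, nil /* toUpsert */)
        }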
    
    Note: We get rid of spanconfig.experimental_kvaccessor.enabled. Access
    to the span configs infrastructure is already sufficiently gated
    through the env var. Now that crdb_internal.create_tenant attempts to
    write through the KVAccessor, it's cumbersome to have to enable the
    setting manually in every multi-tenant test (increasingly the default)
    that exercises some part of the span configs infrastructure.
    
    ---
    
    This commit also needs a migration -- for existing clusters with
    secondary tenants, when upgrading we need to install this initial record
    at the tenant prefix for all extant tenants (and make sure to continue
    doing so for subsequent newly created tenants). This is to preserve the
    hard-split-on-tenant-boundary invariant we wish to provide. It's
    possible for an upgraded multi-tenant cluster to have dormant SQL pods
    that have never reconciled. If KV switches over to the span configs
    subsystem, splitting only on the contents of system.span_configurations,
    we'll fail to split on all tenant boundaries. To this end we introduce
    clusterversion.SeedTenantSpanConfigs, which allows us to seed span
    config data for secondary tenants. The associated migration seeds
    entries for existing tenants.
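
    A sketch of what that migration might look like, continuing the
    illustrative package from the earlier sketch (seedTenantSpanConfig and
    kvAccessor are the hypothetical pieces defined there; listTenants is a
    stand-in for a scan over system.tenants, not a real API):

        // seedExistingTenants installs the tenant-prefix seed record for
        // every extant secondary tenant, preserving the
        // hard-split-on-tenant-boundary invariant for upgrading clusters.
        func seedExistingTenants(
          ctx context.Context,
          kva kvAccessor,
          listTenants func(context.Context) ([]roachpb.TenantID, error),
        ) error {
          ids, err := listTenants(ctx)
          if err != nil {
            return err
          }
          for _, id := range ids {
            if id == roachpb.SystemTenantID {
              continue // only secondary tenants need seeding
            }
            // Upserting is idempotent, so re-running the migration (or
            // racing with a newly reconciling tenant) is fine.
            if err := seedTenantSpanConfig(ctx, kva, id); err != nil {
              return err
            }
          }
          return nil
        }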
    
    Release note: None
    irfansharif committed Dec 15, 2021 · 93d6f22
  2. spanconfig: introduce spanconfig.Reconciler

    Reconciler is responsible for reconciling a tenant's zone configs (SQL
    construct) with the cluster's span configs (KV construct). It's the
    central engine for the span configs infrastructure; a single Reconciler
    instance is active for every tenant in the system.
    
        type Reconciler interface {
          // Reconcile starts the incremental reconciliation process from
          // the given checkpoint. If it does not find MVCC history going
          // far back enough[1], it falls back to a scan of all
          // descriptors and zone configs before being able to do more
          // incremental work. The provided callback is invoked with
          // timestamps that can be safely checkpointed. A future
          // Reconciliation attempt can make use of this timestamp to
          // reduce the amount of necessary work (provided the MVCC
          // history is still available).
          //
          // [1]: It's possible for system.{zones,descriptor} to have been
          //      GC-ed away; think suspended tenants.
          Reconcile(
            ctx context.Context,
            checkpoint hlc.Timestamp,
            callback func(checkpoint hlc.Timestamp) error,
          ) error
        }
    
    Let's walk through what it does. At a high-level, we maintain an
    in-memory data structure that's kept up-to-date with the contents of KV
    (at least the subset of spans we have access to, i.e. the keyspace
    carved out for our tenant ID). We watch for changes to SQL state
    (descriptors, zone configs), translate the SQL updates to the flattened
    span+config form, "diff" the updates against our data structure to see
    if there are any changes we need to inform KV of. If so, we do, and
    ensure that our data structure is kept up-to-date. We continue watching
    for future updates and repeat as necessary.
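
    In code, that loop might look something like the following. The
    sqlWatcher, sqlTranslator, and store interfaces are stand-ins loosely
    modeled on the SQL-watching, translation, and in-memory store pieces of
    the span configs infrastructure; the names and signatures here are
    assumptions for the sake of the example, as is the kvAccessor interface
    (same shape as in the earlier sketch).

        package spanconfigsketch // illustrative only

        import (
          "context"

          "github.com/cockroachdb/cockroach/pkg/roachpb"
          "github.com/cockroachdb/cockroach/pkg/util/hlc"
        )

        type kvAccessor interface {
          UpdateSpanConfigEntries(
            ctx context.Context, toDelete []roachpb.Span, toUpsert []roachpb.SpanConfigEntry,
          ) error
        }

        type sqlWatcher interface {
          // NextUpdates blocks until there are new SQL updates (descriptor
          // or zone config changes), returning the affected descriptor IDs
          // and a timestamp that's safe to checkpoint once those updates
          // have been reconciled.
          NextUpdates(ctx context.Context) (descIDs []int64, checkpoint hlc.Timestamp, err error)
        }

        type sqlTranslator interface {
          // Translate maps descriptor IDs to their flattened span+config form.
          Translate(ctx context.Context, descIDs []int64) ([]roachpb.SpanConfigEntry, error)
        }

        type store interface {
          // Apply diffs the given entries against the in-memory state
          // (which mirrors KV) and returns the deletions/upserts KV needs
          // to be told about, updating the in-memory state along the way.
          Apply(entries []roachpb.SpanConfigEntry) (toDelete []roachpb.Span, toUpsert []roachpb.SpanConfigEntry)
        }

        // reconcileIncrementally watches for SQL updates, pushes the
        // resulting span config diffs to KV, and invokes the callback with
        // checkpointable timestamps along the way.
        func reconcileIncrementally(
          ctx context.Context,
          w sqlWatcher, t sqlTranslator, s store, kva kvAccessor,
          callback func(hlc.Timestamp) error,
        ) error {
          for {
            descIDs, checkpoint, err := w.NextUpdates(ctx)
            if err != nil {
              return err
            }
            entries, err := t.Translate(ctx, descIDs)
            if err != nil {
              return err
            }
            toDelete, toUpsert := s.Apply(entries)
            if len(toDelete)+len(toUpsert) > 0 {
              if err := kva.UpdateSpanConfigEntries(ctx, toDelete, toUpsert); err != nil {
                return err
              }
            }
            if err := callback(checkpoint); err != nil {
              return err
            }
          }
        }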
    
    There's only a single instance of the Reconciler running for a given
    tenant at a given point in time (mutual exclusion/leasing is provided
    by the jobs subsystem). We needn't worry about contending writers, or
    the KV state being changed from underneath us. What we do have to
    worry about, however, is that suspended tenants aren't reconciling
    while suspended. It's possible for a suspended tenant's SQL state to
    be GC-ed away at older MVCC timestamps; when watching for changes, we
    could fail to observe tables/indexes/partitions getting deleted. Left
    as is, this would result in us never issuing corresponding deletion
    requests for the dropped span configs -- we'd be leaving orphaned span
    configs lying around (taking up storage space and creating pointless
    empty ranges). A "full reconciliation pass" is our attempt to find all
    these extraneous entries in KV and to delete them.
    
    We can use our span config data structure here too, one that's
    pre-populated with the contents of KV. We translate the entire SQL state
    into constituent spans and configs, diff against our data structure to
    generate KV updates that we then apply. We follow this with clearing out
    all these spans in our data structure, leaving behind all extraneous
    entries to be found in KV -- entries we can then simply issue deletes
    for.
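
    Roughly, and continuing with the stand-in interfaces from the sketch
    above (the fullStore methods here -- Populate, Delete, Remaining -- are
    likewise invented for illustration):

        type fullStore interface {
          store
          // Populate seeds the in-memory state with the current KV contents.
          Populate(entries []roachpb.SpanConfigEntry)
          // Delete clears the given spans from the in-memory state.
          Delete(spans []roachpb.Span)
          // Remaining returns whatever spans are still present, i.e.
          // entries found in KV that no part of the SQL state accounts for.
          Remaining() []roachpb.Span
        }

        // reconcileFully diffs the entire SQL state against KV and deletes
        // any orphaned span config entries (e.g. for tables dropped while
        // the tenant's pods were suspended).
        func reconcileFully(
          ctx context.Context,
          kvEntries []roachpb.SpanConfigEntry, // everything under the tenant's keyspace
          sqlEntries []roachpb.SpanConfigEntry, // full translation of descriptors + zone configs
          s fullStore, kva kvAccessor,
        ) error {
          s.Populate(kvEntries)

          // Diff the desired (SQL-derived) state against what's in KV and
          // apply the resulting updates.
          toDelete, toUpsert := s.Apply(sqlEntries)
          if len(toDelete)+len(toUpsert) > 0 {
            if err := kva.UpdateSpanConfigEntries(ctx, toDelete, toUpsert); err != nil {
              return err
            }
          }

          // Clear out the spans we just reconciled; whatever's left behind
          // exists only in KV and is extraneous -- issue deletes for it.
          for _, e := range sqlEntries {
            s.Delete([]roachpb.Span{e.Span})
          }
          if orphaned := s.Remaining(); len(orphaned) > 0 {
            return kva.UpdateSpanConfigEntries(ctx, orphaned, nil /* toUpsert */)
          }
          return nil
        }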
    
    Release note: None
    irfansharif committed Dec 15, 2021 · 5f28900
  3. spanconfig: improve typing for Update

    By using a type definition instead of a stand-alone type, we can reduce
    the amount of code needed to convert from a SpanConfigEntry to an
    Update (something we do frequently), and it makes the conversion less
    error-prone. While here, we introduce two helpful constructors for the
    two kinds of updates we're typically interested in -- additions and
    deletions.
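
    The shape of the change, sketched (the real definitions live in
    pkg/spanconfig; the exact field names and constructor signatures here
    are assumptions):

        package spanconfig // illustrative sketch, not the actual file

        import "github.com/cockroachdb/cockroach/pkg/roachpb"

        // Update is a type definition over SpanConfigEntry, so converting
        // between the two is a plain conversion: Update(entry).
        type Update roachpb.SpanConfigEntry

        // Addition constructs an update that upserts the given config over
        // the given span.
        func Addition(span roachpb.Span, conf roachpb.SpanConfig) Update {
          return Update{Span: span, Config: conf}
        }

        // Deletion constructs an update that deletes the config over the
        // given span (an empty config denotes deletion here).
        func Deletion(span roachpb.Span) Update {
          return Update{Span: span, Config: roachpb.SpanConfig{}}
        }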
    
    Release note: None
    irfansharif committed Dec 15, 2021 · f22eefa