backupccl: add prototype metadata.sst #76705
Conversation
Force-pushed from 567f382 to 54bf5d4
Been trying to think forward to the next version and about how we'll write this SST -- sorted -- for an unbounded number of files during a backup. For now, of course, we assume all file metadata fits in RAM to write the proto, so we can also sort it to write the SST, but if we wanted to change that later, this might be tricky? We could of course record files (in rows in some job state table, perhaps) and then read them sorted to make the SST all at once. Another idea that might motivate changing this now would be to pull the … I suppose we don't need to do that now: if we flushed the smaller files as we went, we could always read them back later in a file-merge phase of the backup to produce this SST as-is, with its contract being that it is the complete list. So maybe nothing we need to do now?
I think it makes sense to implement this now. Is the change to have the main manifest SST contain a list of SST filenames, with the file data stored in those SSTs? What about the other fields that may be large, maybe spans or stats?
I think spans and stats are OK since we can iterate over all the tables we're backing up and add their span, or their stats, to the SST builder, in order, flush, and move on to the next, without holding them all in memory at once. Files is the hard one, I think, because they're produced out of order but need to go into the SST in-order, thus the forced buffering stage to sort. |
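To make the buffering point concrete, here is a minimal Go sketch with invented names (kvWriter, writeSpans, writeFiles) rather than the PR's actual API: per-table data such as spans can be streamed to a sorted SST builder one entry at a time, while file entries arrive out of order and must be buffered and sorted before they can be emitted.

package sketch

import (
	"fmt"
	"sort"
)

// kvWriter stands in for an SST builder that requires keys in ascending order.
type kvWriter interface {
	Put(key, value []byte) error
}

// writeSpans streams one entry per table in key order, so only the current
// entry needs to be held in memory at a time.
func writeSpans(w kvWriter, tableIDs []uint64, spanFor func(uint64) []byte) error {
	sort.Slice(tableIDs, func(i, j int) bool { return tableIDs[i] < tableIDs[j] })
	for _, id := range tableIDs {
		// Zero-padding keeps lexicographic key order aligned with numeric order.
		key := []byte(fmt.Sprintf("span/%020d", id))
		if err := w.Put(key, spanFor(id)); err != nil {
			return err
		}
	}
	return nil
}

// writeFiles must buffer every file entry produced (out of order) during the
// backup, sort the keys, and only then emit them: the forced buffering stage
// discussed above.
func writeFiles(w kvWriter, files map[string][]byte) error {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		if err := w.Put([]byte("file/"+p), files[p]); err != nil {
			return err
		}
	}
	return nil
}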
pkg/ccl/backupccl/backup_metadata.go
Outdated
key := bi.Iter.UnsafeKey()
resWrapper.key.Key = resWrapper.key.Key[:0]
resWrapper.value = resWrapper.value[:0]
resWrapper.key.Key = append(resWrapper.key.Key, key.Key...)
I think we have a Clone() method on key these days?
Done.
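For reference, a tiny sketch of the suggested change, assuming the key type exposes a Clone() that deep-copies its byte slice (the types here are stand-ins, not the real storage package ones):

package sketch

// mvccKey is a stand-in for the iterator's key type; the suggestion assumes
// the real type has a Clone() that deep-copies the underlying byte slice.
type mvccKey struct {
	Key []byte
}

// Clone returns a copy whose bytes remain valid after the iterator advances
// past the unsafe key it was read from.
func (k mvccKey) Clone() mvccKey {
	return mvccKey{Key: append([]byte(nil), k.Key...)}
}

// copyUnsafeKey replaces the manual reset-and-append with a single Clone call.
func copyUnsafeKey(unsafe mvccKey, dst *mvccKey) {
	*dst = unsafe.Clone()
}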
iterError error
}

func makeBytesIter(
I'm probably missing something, but it seems like we already open the SST in makeMetadata to read the fixed set of details (née manifest), so we could just pass that one instead of re-opening it for each of these typed iterators?
And then perhaps the typed iterators, if they just seek and read from it directly, could potentially also avoid some cloning of the unsafe value just to pass it to unmarshal anyway, so maybe we wouldn't even need the intermediate bytesIter at all?
I was trying to avoid the case where someone might use multiple of these iterators in a nested manner, or instantiate another iterator without exhausting the current one, e.g. in something like this where spans is nested inside descriptors:
cockroach/pkg/ccl/backupccl/restore_planning.go
Line 1915 in 4f4f2ac
if index.Adding() && spans.ContainsKey(p.ExecCfg().Codec.IndexPrefix(uint32(table.ID), uint32(index.GetID()))) {
We can Seek() the iterator around to different key ranges if we want to read some things here, then jump to some there, and if someone knows they really want to consume a little this and a little that side-by-side at the same time, they could choose themselves to just open duplicate iterators to avoid the seeks, so I don't know if I'd prescribe that ahead of time by forcing creation of type-specific readers.
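A sketch of the two usage patterns being weighed, using an invented metaIter interface rather than the PR's actual iterator types: a single shared iterator that is re-seeked between key prefixes, versus the caller opening a second iterator when it genuinely needs to consume two ranges side by side.

package sketch

// metaIter is a stand-in for a seekable iterator over the metadata SST; the
// PR's real iterator types are assumed to behave roughly like this.
type metaIter interface {
	SeekGE(key []byte) bool
	Next() bool
	Value() []byte
	Close()
}

// readWithSeeks reads descriptors and then spans from one shared iterator by
// seeking between key prefixes; cheap when the reads happen one after another.
// (Prefix-boundary checks are omitted for brevity.)
func readWithSeeks(it metaIter) {
	for ok := it.SeekGE([]byte("desc/")); ok; ok = it.Next() {
		_ = it.Value() // consume a descriptor
	}
	for ok := it.SeekGE([]byte("span/")); ok; ok = it.Next() {
		_ = it.Value() // consume a span
	}
}

// readSideBySide is the caller-chosen alternative: when two ranges really are
// consumed interleaved, open a second iterator instead of bouncing one
// iterator back and forth with seeks.
func readSideBySide(open func() metaIter) {
	descs := open()
	defer descs.Close()
	spans := open()
	defer spans.Close()
	for ok := descs.SeekGE([]byte("desc/")); ok; ok = descs.Next() {
		// For each descriptor, look something up in the span range without
		// disturbing the descriptor iterator's position.
		if spans.SeekGE([]byte("span/")) {
			_ = spans.Value()
		}
	}
}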
// and false if there are no more elements or if an error was encountered. When
// Next returns false, the user should call the Err method to verify the
// existence of an error.
func (si *SpanIterator) Next(span *roachpb.Span) bool {
I don't have any particular objection to this write-to-value-at-pointer style, but I think pebble and pkg/storage tend to use a separate method to read the current value, e.g. for i.Next() { x := i.Val(); use(x) }. I don't see a particular difference though.
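For comparison, a minimal sketch of the two iterator styles mentioned, with stand-in types rather than the real roachpb/pebble ones:

package sketch

// span is a stand-in for roachpb.Span in this sketch.
type span struct{ Key, EndKey []byte }

// Style used in the PR: Next writes into a caller-provided value and reports
// whether one was produced; Err is checked once the loop ends.
type writeToPointerIter interface {
	Next(dst *span) bool
	Err() error
}

// Style closer to pebble / pkg/storage: Next only advances, and a separate
// accessor returns the current value.
type valueMethodIter interface {
	Next() bool
	Val() span
	Err() error
}

// Both styles read the same way; the difference is mostly where the current
// value lives.
func consumeA(it writeToPointerIter) ([]span, error) {
	var out []span
	var s span
	for it.Next(&s) {
		out = append(out, s)
	}
	return out, it.Err()
}

func consumeB(it valueMethodIter) ([]span, error) {
	var out []span
	for it.Next() {
		out = append(out, it.Val())
	}
	return out, it.Err()
}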
This adds writing of an additional file to the completion of BACKUP. This new file is an sstable that contains the same metadata currently stored in the BACKUP_MANIFEST file and statistics files, but organizes that data differently.

The current BACKUP_MANIFEST file contains a single binary-encoded protobuf message of type BackupManifest, which in turn has several fields, some of which are repeated to contain e.g. the TableDescriptor for every table backed up, or every revision to every table descriptor backed up. This can result in these manifests being quite large in some cases, which is potentially concerning because, as a single protobuf message, one has to read and unmarshal the entire struct into memory to read any field(s) of it. Organizing this metadata into an SSTable where repeated fields are instead stored as separate messages under separate keys should allow reading it incrementally: one can seek to a particular key or key prefix and then scan, acting on whatever data is found as it is read, without loading the entire file at once (when opened using the same seeking remote SST reader we use to read backup data SSTs).

This initial prototype adds only the writer -- RESTORE does not rely on, or even open, this new file at this time.

Release note: none.
This is just a debugging tool for inspecting a new backup metadata SST. Release note: none.
Release note: None
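To illustrate the layout described in the first commit message above, here is a hypothetical key scheme (the prefixes and encodings are invented for this sketch, not the PR's actual ones): each repeated manifest field becomes a range of individually keyed entries that a reader can seek to and scan incrementally.

package sketch

import "fmt"

// Hypothetical per-field key prefixes: each repeated manifest field becomes a
// contiguous range of keys that can be seeked to and scanned independently.
const (
	descPrefix = "desc/"  // one key per table descriptor (and per revision)
	filePrefix = "file/"  // one key per backed-up data file
	spanPrefix = "span/"  // one key per backed-up span
	statPrefix = "stats/" // one key per table-statistics record
)

// descKey places every revision of a descriptor adjacently, in order, so a
// reader can seek to descPrefix+tableID and scan just that table's history.
// Zero-padding keeps lexicographic order aligned with numeric order.
func descKey(tableID uint64, revision int) string {
	return fmt.Sprintf("%s%020d/%010d", descPrefix, tableID, revision)
}

func fileKey(path string) string {
	return filePrefix + path
}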
I changed the KVs in the main SST to store paths to additional SSTs that contain the actual file data.
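A hedged sketch of that indirection, with invented names (sstReader, the files-index key, and the opener callback): the value stored under a main-SST key is just a path, and the reader opens the referenced child SST on demand.

package sketch

import "fmt"

// sstReader is a minimal stand-in for a seekable reader over a single SST.
type sstReader interface {
	SeekGE(key []byte) bool
	Value() []byte
}

// fileSSTFromManifest looks up a hypothetical "files-index" key in the main
// metadata SST, treats its value as the path of a child SST holding the file
// entries, and opens that child SST via the supplied opener.
func fileSSTFromManifest(
	main sstReader, filesIndexKey []byte, open func(path string) (sstReader, error),
) (sstReader, error) {
	if !main.SeekGE(filesIndexKey) {
		return nil, fmt.Errorf("no file-SST entry under %q", filesIndexKey)
	}
	childPath := string(main.Value()) // the value is a path, not the file data itself
	return open(childPath)
}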
bors r+
Build succeeded: