Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: stop returning raw protobufs from descriptor accessors #56306

Closed
jordanlewis opened this issue Nov 5, 2020 · 2 comments
Closed

sql: stop returning raw protobufs from descriptor accessors #56306

jordanlewis opened this issue Nov 5, 2020 · 2 comments
Assignees
Labels
A-schema-descriptors Relating to SQL table/db descriptor handling. meta-issue Contains a list of several other issues.

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Nov 5, 2020

Currently, the interfaces defined for the top level descriptor types (catalog.{,Database,Schema,Table,Type}Descriptor) return raw protobufs from most of their methods. This is undesirable (at minimum) because the return values are easily mutated by accident, and because there's no opportunity to extend the raw protobufs with more advanced caching data structures.

Adding interfaces here instead of raw protos will also improve API uniformity and make our codebase more easily tested and understood.


The goal of this issue is to wrap the returned protobufs in an struct and accompanying interface that hides the underlying protobufs. Some of the leaf structs matter less - for example, UserPrivilege isn't important to wrap. It's more important to prevent the big ones from mutability: ColumnDescriptor, IndexDescriptor, and so on.

We'll be extending catalog.Descriptor and its sub-interfaces (DatabaseDescriptor, SchemaDescriptor, TableDescriptor, TypeDescriptor) so that their methods do not return things from the descpb package at all.

For example, we'll want to start returning a new catalog.IndexDescriptor interface instead of descpb.IndexDescriptor from TableDescriptor.GetPrimaryIndex().

Here is the list of interfaces we'll want to extract into sql/catalog, along with associated new structs inside of sql/catalog/tabledesc:

We'll need to figure out a pattern for iteration as well, for methods like GetPublicNonPrimaryIndexes(). We don't want to have to allocate a slice of n interfaces and structs for each of these items - for example, the ColumnDescriptors inside of an IndexDescriptor.


In addition to the new interfaces listed above we'll also want to extend and use the existing catalog interfaces:


We'll also want to unify the new interfaces with those in pkg/sql/opt/cat at some point:

  • cat.Index -> catalog.Index
  • cat.Column -> catalog.Column
  • others as well most likely (TBD)
@jordanlewis jordanlewis added meta-issue Contains a list of several other issues. A-schema-descriptors Relating to SQL table/db descriptor handling. labels Nov 5, 2020
@blathers-crl
Copy link

blathers-crl bot commented Nov 5, 2020

Hi @jordanlewis, please add a C-ategory label to your issue. Check out the label system docs.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

craig bot pushed a commit that referenced this issue Jan 7, 2021
58273: sql: add catalog.Index interface, replace catalog.TableDescriptor methods for indexes r=postamar a=postamar

In an effort to not directly use `descpb.IndexDescriptor protos` everywhere, this multi-commit PR:
1. introduces a new `catalog.Index` interface to encapsulate them, and a bunch of new methods in `catalog.TableDescriptor` to replace those which have `descpb.IndexDescriptor` in their signatures, this is done in the first commit;
2. replaces calls to the old methods with calls to the new methods throughout the codebase, this is done one method at a time in the subsequent commits.

This breakdown into multiple commits is done to ease review, after all most of the contents of this diff are effectively little more than noise. Still, there are a few non-straightforward changes, but I figured they were worth it, though. 

This PR is motivated by #57465, see also its parent issue #56306 for more details. 

Subsequent work should consist in propagating this `descpb.IndexDescriptor -> catalog.Index` change further throughout the codebase.

Co-authored-by: Marius Posta <marius@cockroachlabs.com>
@postamar postamar self-assigned this Jan 27, 2021
craig bot pushed a commit that referenced this issue Feb 5, 2021
59789: sql: add catalog.Column interface, replace catalog.TableDescriptor methods for columns r=postamar a=postamar

In an effort to not directly use `descpb.ColumnDescriptor` protos everywhere, this multi-commit PR:
1. introduces a new `catalog.Column` interface to encapsulate them, and a bunch of new methods in `catalog.TableDescriptor` to replace those which have `descpb.ColumnDescriptor` in their signatures, this is done in the first commit;
2. replaces calls to the old methods with calls to the new methods throughout the codebase in the subsequent commits.

This breakdown into multiple commits is done to ease review, most of the substantial changes are in the first commit, the others follow through with the change and so are quite noisy.

Fixes #59805. 

See also the parent issue #56306, as well as the related issue #57465 which tracks the same work I've already done for `descpb.IndexDescriptor`. In this PR I reused many of the same patterns I introduced in the corresponding PR #58678.

Subsequent work should consist in propagating this `descpb.ColumnDescriptor -> catalog.Column` change further throughout the codebase.

Co-authored-by: Marius Posta <marius@cockroachlabs.com>
@postamar
Copy link
Contributor

I believe we should close this issue for now. This doesn't mean there's nothing left to do re. virtualization of descriptor protos, in fact there's plenty to do. However:

  1. At this point it the direction of future work has been set and there's not much to discuss, on, for instance, how to virtualize column families.
  2. The original goal was to wrap everything in descpb. At this point this isn't particularly desirable (nothing wrong with descpb.ID for instance) and I'm quite happy with our choice of not virtualizing mutable descriptors: immutable table descriptors are used via a catalog.TableDescriptor interface but mutables are a concrete tabledesc.Mutable.
  3. Our efforts on the new schema changer will probably drive this in un-anticipated directions anyway.

Let's leave things be as they are and revisit this after the next release.

postamar pushed a commit to postamar/cockroach that referenced this issue Apr 27, 2021
This commit aims at reducing direct usage of
descpb.PartitioningDescriptor types and instead using the new
catalog.Partitioning interface. This refactoring is in line with recent
virtualization work tracked under cockroachdb#56306.

In an effort to improve readability and ability to reason about state
changes, this commit changes the type signature of sql.CreatePartitioning
and, notably, removes its side-effects on the index descriptor. The
latter is now modified using a new tabledesc.UpdateIndexPartitioning
function.

Release note: None
postamar pushed a commit to postamar/cockroach that referenced this issue May 5, 2021
This commit aims at reducing direct usage of
descpb.PartitioningDescriptor types and instead using the new
catalog.Partitioning interface. This refactoring is in line with recent
virtualization work tracked under cockroachdb#56306.

In an effort to improve readability and ability to reason about state
changes, this commit changes the type signature of sql.CreatePartitioning
and, notably, removes its side-effects on the index descriptor. The
latter is now modified using a new tabledesc.UpdateIndexPartitioning
function.

Release note: None
postamar pushed a commit to postamar/cockroach that referenced this issue May 13, 2021
This commit aims at reducing direct usage of
descpb.PartitioningDescriptor types and instead using the new
catalog.Partitioning interface. This refactoring is in line with recent
virtualization work tracked under cockroachdb#56306.

In an effort to improve readability and ability to reason about state
changes, this commit changes the type signature of sql.CreatePartitioning
and, notably, removes its side-effects on the index descriptor. The
latter is now modified using a new tabledesc.UpdateIndexPartitioning
function.

Release note: None
craig bot pushed a commit that referenced this issue May 14, 2021
64013: sql: add catalog.Partitioning interface r=postamar a=postamar

This commit aims at reducing direct usage of
descpb.PartitioningDescriptor types and instead using the new
catalog.Partitioning interface. This refactoring is in line with recent
virtualization work tracked under #56306.

In an effort to improve readability and ability to reason about state
changes, this commit changes the type signature of sql.CreatePartitioning
and, notably, removes its side-effects on the index descriptor. The
latter is now modified using a new tabledesc.UpdateIndexPartitioning
function.

Release note: None

65137: blobs: provide io.Writer API for steaming blob writes r=dt a=dt

This changes the API for the blob service's Write method from one which
takes a Reader and writes its content to one which provides a Writer and
leaves it to the caller to write to it. This allows the nodelocal
ExternalStorage implementation to expose the blob writer directly rather
than spin up an extra goroutine run this blob stream off a pipe to yeild
a writer.

Release note: none.

65167: delegate: fix zone configs for SHOW CREATE TABLE with no partitions r=jordanlewis a=otan

Fixed the regression in c44619c which
had no tests to cover this case.

Release note (bug fix): Fixed a bug where if the zone configurations
would not display on a SHOW CREATE TABLE if there are no partitions
even though there are zone configurations on the index or tables
themselves.

Co-authored-by: Marius Posta <marius@cockroachlabs.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-descriptors Relating to SQL table/db descriptor handling. meta-issue Contains a list of several other issues.
Projects
None yet
Development

No branches or pull requests

2 participants