Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Add Logical Column ID field to ColumnDescriptor #46992

Merged
merged 1 commit into from
Apr 29, 2020

Conversation

RichardJCai
Copy link
Contributor

@RichardJCai RichardJCai commented Apr 3, 2020

The LogicalColumnID field mimics the ColumnID field however LogicalColumnID may be swapped
between two columns whereas ColumnID cannot. LogicalColumnID is referenced for virtual tables
(pg_catalog, information_schema) and most notably affects column ordering for SHOW COLUMNS.

This LogicalColumnID field support swapping the order of two columns - currently only used for
ALTER COLUMN TYPE when a shadow column is created and swapped with it's original column.

Does not affect existing behaviour.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rohany
Copy link
Contributor

rohany commented Apr 6, 2020

Can you update this PR to include the representation upgrade path on descriptor read?

@RichardJCai
Copy link
Contributor Author

Will do

@rohany rohany requested a review from jordanlewis April 7, 2020 18:10
@RichardJCai RichardJCai changed the title WIP: Test Add Logical Column ID field to ColumnDescriptor sql: Add Logical Column ID field to ColumnDescriptor Apr 7, 2020
Copy link
Contributor

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test that writes a descriptor to disk without logical column IDs set, and ensures that when we read the descriptor back we get the logical col IDs populated?

@RichardJCai
Copy link
Contributor Author

Can you also add a test that writes a descriptor to disk without logical column IDs set, and ensures that when we read the descriptor back we get the logical col IDs populated?

Created this test in logical_column_id_test.go

@rohany
Copy link
Contributor

rohany commented Apr 13, 2020

LGTM barring a review from @jordanlewis and comments on the new field on the column descriptor. I'd like to see:

  • A description of the new field
  • When it should be used/changed and what is the difference from column ID
  • A note that this doesn't exist on old descriptors

@blathers-crl
Copy link

blathers-crl bot commented Apr 13, 2020

❌ The GitHub CI (Cockroach) build has failed on 7afcfd0d.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 14, 2020

❌ The GitHub CI (Cockroach) build has failed on 7afcfd0d.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@RichardJCai
Copy link
Contributor Author

LGTM barring a review from @jordanlewis and comments on the new field on the column descriptor. I'd like to see:

  • A description of the new field
  • When it should be used/changed and what is the difference from column ID
  • A note that this doesn't exist on old descriptors

Added these.

@rohany
Copy link
Contributor

rohany commented Apr 14, 2020

comment looks good, ty

@rohany
Copy link
Contributor

rohany commented Apr 15, 2020

Just thought of this -- allocate IDs should set this for all new columns

@RichardJCai
Copy link
Contributor Author

Just thought of this -- allocate IDs should set this for all new columns

It currently does do this - AllocateIDs calls MaybeFillColumnIDs which fills out the ID and LogicalColumnID for new cols.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I don't think that this will work in all cases yet.

Isn't it the case that things will go wrong if you have a table with columns (a, b, c), and you re-type b to a new type, so it gets logical id 2 but physical id 4? What happens in that case if you do SELECT * FROM t or INSERT INTO t VALUES(1,2,3)? Can you write a test for this and see what happens?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RichardJCai)


pkg/sql/sqlbase/structured.proto, line 132 at r4 (raw file):

  // computed column.
  optional string compute_expr = 11;
  // LogicalColumnID is used to order columns visually. This is because

It's not just visually. It's also for the purposes of SELECT * and INSERT INTO without column specifiers, right?

@rohany
Copy link
Contributor

rohany commented Apr 15, 2020

It's not just visually. It's also for the purposes of SELECT * and INSERT INTO without column specifiers, right?

I think it's just for display in virtual tables -- we can safely use the column ordinal position in the other cases that you are thinking about (like selects or inserts).

@RichardJCai
Copy link
Contributor Author

Hmm... I don't think that this will work in all cases yet.

Isn't it the case that things will go wrong if you have a table with columns (a, b, c), and you re-type b to a new type, so it gets logical id 2 but physical id 4? What happens in that case if you do SELECT * FROM t or INSERT INTO t VALUES(1,2,3)? Can you write a test for this and see what happens?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RichardJCai)

pkg/sql/sqlbase/structured.proto, line 132 at r4 (raw file):

  // computed column.
  optional string compute_expr = 11;
  // LogicalColumnID is used to order columns visually. This is because

It's not just visually. It's also for the purposes of SELECT * and INSERT INTO without column specifiers, right?

For the 1st point - I have a test for that in the WIP PR - it's working as intended.
Here's the test

For the 2nd point
No it doesn't affect the SELECT * and INSERT INTO orders - those rely on the ordering of the ColumnDescriptors in the TableDescriptor's columns slice.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RichardJCai)


pkg/sql/sqlbase/structured.go, line 1215 at r4 (raw file):

// if the ColumnID is set and the LogicalColumnID is not
// it assigns the ColumnID as the LogicalColumnID.
func (desc *TableDescriptor) MaybeFillInLogicalColumnID() {

Rather than have this method, would it be better to only populate these lazily? You could have a method called GetLogicalColumnID that either returns the ID or the LogicalColumnID if it is set.


pkg/sql/sqlbase/structured.proto, line 132 at r4 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

For the 1st point - I have a test for that in the WIP PR - it's working as intended. Here's the test

For the 2nd point No it doesn't affect the SELECT * and INSERT INTO orders - those rely on the ordering of the ColumnDescriptors in the TableDescriptor's columns slice.

Ah okay. That test belongs in this PR then. Also, I would explain in this comment why having this ordering is necessary, on top of the ordering provided by the ColumnDescriptor order in the slice. It's too easy to forget.


pkg/sql/sqlbase/system.go, line 705 at r4 (raw file):

		Version:                 1,
		Columns: []ColumnDescriptor{
			{Name: "key", ID: 1, LogicalColumnID: 1, Type: *types.String},

This sort of thing is why I'm not super crazy about requiring LogicalColumnID. I feel like the lazily populated method might avoid this pain.

@RichardJCai
Copy link
Contributor Author

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RichardJCai)

pkg/sql/sqlbase/structured.go, line 1215 at r4 (raw file):

// if the ColumnID is set and the LogicalColumnID is not
// it assigns the ColumnID as the LogicalColumnID.
func (desc *TableDescriptor) MaybeFillInLogicalColumnID() {

Rather than have this method, would it be better to only populate these lazily? You could have a method called GetLogicalColumnID that either returns the ID or the LogicalColumnID if it is set.

pkg/sql/sqlbase/structured.proto, line 132 at r4 (raw file):

Previously, RichardJCai (Richard Cai) wrote…
Ah okay. That test belongs in this PR then. Also, I would explain in this comment why having this ordering is necessary, on top of the ordering provided by the ColumnDescriptor order in the slice. It's too easy to forget.

pkg/sql/sqlbase/system.go, line 705 at r4 (raw file):

		Version:                 1,
		Columns: []ColumnDescriptor{
			{Name: "key", ID: 1, LogicalColumnID: 1, Type: *types.String},

This sort of thing is why I'm not super crazy about requiring LogicalColumnID. I feel like the lazily populated method might avoid this pain.

Haha I actually used the lazy method (exactly how you described) in the initial PR.

Regarding having the test in this PR - that logic test depends on the ALTER COLUMN TYPE feature which doesn't exist in this PR. I can't think of an easy way to test the case you described without ALTER COLUMN TYPE right now.

@blathers-crl
Copy link

blathers-crl bot commented Apr 21, 2020

❌ The GitHub CI (Cockroach) build has failed on 018e3e2a.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Contributor

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with comment nits.

@@ -389,6 +389,8 @@ func allocateTableRewrites(
// backup descriptors provided. if skipFKsWithNoMatchingTable is set, FKs whose
// "other" table is missing from the set provided are omitted during the
// upgrade, instead of causing an error to be returned.
// Also assigns LogicalColumnIDs for TableDescriptors that do not have it,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is out of date

@@ -387,6 +387,7 @@ func GetTableDescFromIDWithFKsChanged(
if err != nil {
return nil, false, err
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary changes

pkg/sql/sqlbase/structured.go Outdated Show resolved Hide resolved
@@ -129,6 +129,19 @@ message ColumnDescriptor {
// Expression to use to compute the value of this column if this is a
// computed column.
optional string compute_expr = 11;
// LogicalColumnID is lazily set, do not access directly,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: Say something like "LogicalColumnID must be accessed through the accessor ... it is incorrect to access it directly.

@@ -129,6 +129,19 @@ message ColumnDescriptor {
// Expression to use to compute the value of this column if this is a
// computed column.
optional string compute_expr = 11;
// LogicalColumnID is lazily set, do not access directly,
// access using GetLogicalColumnID().
// LogicalColumnID is used to order columns visually. This is because
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the example here could be shrunk to just mention ordering in catalog tables, rather than the specific case.

@blathers-crl
Copy link

blathers-crl bot commented Apr 22, 2020

❌ The GitHub CI (Cockroach) build has failed on 6bd31e4e.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 22, 2020

❌ The GitHub CI (Cockroach) build has failed on a3f3cd93.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 22, 2020

❌ The GitHub CI (Cockroach) build has failed on 4e53cd85.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 22, 2020

❌ The GitHub CI (Cockroach) build has failed on fc8e63b2.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@rohany
Copy link
Contributor

rohany commented Apr 22, 2020

have you run make again? its failing because generated proto is different

@RichardJCai
Copy link
Contributor Author

have you run make again? its failing because generated proto is different

Yeah, the proto binary was different cause I added a comment

@RichardJCai RichardJCai force-pushed the column_descriptor_logical_id branch 2 times, most recently from d473a31 to a0d0616 Compare April 29, 2020 15:50
@blathers-crl
Copy link

blathers-crl bot commented Apr 29, 2020

❌ The GitHub CI (Cockroach) build has failed on a0d06160.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

The LogicalColumnID field mimics the ColumnID field however LogicalColumnID may be swapped
between two columns whereas ColumnID cannot. LogicalColumnID is referenced for virtual tables
(pg_catalog, information_schema) and most notably affects column ordering for SHOW COLUMNS.

This LogicalColumnID field support swapping the order of two columns - currently only used for
ALTER COLUMN TYPE when a shadow column is created and swapped with it's original column.

Does not affect existing behaviour.

Release note: None
@RichardJCai
Copy link
Contributor Author

bors r=rohany

@craig
Copy link
Contributor

craig bot commented Apr 29, 2020

Build succeeded

@craig craig bot merged commit 456d01b into cockroachdb:master Apr 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants