Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add user id column to system.users table and perform migration #81457

Merged

Conversation

RichardJCai
Copy link
Contributor

Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC #77453
Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RichardJCai RichardJCai force-pushed the user_id_in_system_table_05162022 branch 11 times, most recently from 492be21 to 391a6f7 Compare May 24, 2022 17:35
@RichardJCai RichardJCai marked this pull request as ready for review May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team as a code owner May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team as a code owner May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team May 24, 2022 18:37
@RichardJCai RichardJCai requested review from a team as code owners May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team May 24, 2022 18:37
@RichardJCai RichardJCai requested a review from a team as a code owner May 24, 2022 18:37
@RichardJCai RichardJCai requested review from stevendanna and removed request for a team May 24, 2022 18:37
@RichardJCai
Copy link
Contributor Author

Would like someone from @cockroachdb/sql-schema to review this migration logic.

https://github.com/cockroachdb/cockroach/pull/81457/files#diff-66ce01e5399fea40fb05669301fe0b4fa19898a9c232787a7ec0a7dcbb34492aR27-R35

The migration adds a default expression that calls nextval seq which currently isn't directly supported.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RichardJCai and @stevendanna)


pkg/ccl/streamingccl/streamclient/random_stream_client.go line 279 at r1 (raw file):

		catpb.NewBasePrivilegeDescriptor(username.RootUserName()),
		nil,
		nil,

could you add inline comments for the parameter names?


pkg/sql/catalog/systemschema/system.go line 89 at r1 (raw file):

  "hashedPassword" BYTES,
  "isRole"         BOOL NOT NULL DEFAULT false,
	user_id          OID NOT NULL DEFAULT oid(nextval(48:::OID)),

nits:

  • could you add a comment for this "48" id?
  • rather than using OID, i think the INT4 type is a bit more explicit about how big the value can be. OIDs have a bit of a rough edge right now, where they actually get stored as an 8 byte int, so i'm a bit worried about using it in a system table

pkg/sql/catalog/systemschema/system.go line 91 at r1 (raw file):

	user_id          OID NOT NULL DEFAULT oid(nextval(48:::OID)),
  CONSTRAINT "primary" PRIMARY KEY (username),
	INDEX (user_id)

this index should be unique


pkg/sql/catalog/systemschema/system.go line 134 at r1 (raw file):

	RoleIDSequenceSchema = `
CREATE SEQUENCE system.role_id_seq START 100;`

is it starting at 100 so we have reserved IDs? can you add that in a comment?


pkg/sql/catalog/systemschema/system.go line 895 at r1 (raw file):

	trueBoolString   = "true"
	zeroIntString    = "0:::INT8"
	genNextOIDString = "oid(nextval(48:::OID))"

genNextOIDString is specific to the sequence with an ID of 48. so i think it should be named genNextUserOIDString maybe


pkg/sql/catalog/systemschema/system.go line 917 at r1 (raw file):

			pk("username"),
			descpb.IndexDescriptor{
				Name:                "users_user_id_idx",

this should be a unique index


pkg/sql/catalog/systemschema/system.go line 1047 at r1 (raw file):

			opts := &descpb.TableDescriptor_SequenceOpts{
				Increment: 1,
				MinValue:  1,

shouldn't MinValue be 100 too? otherwise i believe the sequence cycles around (maybe i'm mis-remembering)


pkg/sql/catalog/systemschema/system.go line 1048 at r1 (raw file):

				Increment: 1,
				MinValue:  1,
				MaxValue:  math.MaxInt64,

if we're using OIDs, the max value should be MaxInt32


pkg/startupmigrations/migrations.go line 785 at r1 (raw file):

	        UPSERT INTO system.users (username, "hashedPassword", "isRole", "user_id") VALUES ($1, '', false,  1)
	        `
	return r.execAsRootWithRetry(ctx, "addRootUser", upsertRootStmt, username.RootUser)

nit: should we define constants for RootUserID and AdminRoleID and use them as placeholder args here?


pkg/upgrade/upgrades/descriptor_utils.go line 40 at r1 (raw file):

		tKey := catalogkeys.EncodeNameKey(codec, desc)

		got, err := txn.Get(ctx, tKey)

why did this get pulled out?


pkg/upgrade/upgrades/system_users_role_id_migration.go line 39 at r1 (raw file):

const addUserIDColumn = `
ALTER TABLE system.users
ADD COLUMN IF NOT EXISTS "user_id" OID CREATE FAMILY "fam_4_user_id" 

could you comment why we need a new family here?


pkg/upgrade/upgrades/system_users_role_id_migration.go line 83 at r1 (raw file):

	const upsertRootStmt = `
		       UPSERT INTO system.users (username, "hashedPassword", "isRole", "user_id") VALUES ('root', '', false, 1)

it seems like this would wipe out the password if there was one previously defined for root


pkg/ccl/backupccl/testdata/backup-restore/feature-flags line 86 at r1 (raw file):


exec-sql
RESTORE SYSTEM USERS FROM 'nodelocal://1/deprecated';

why did this test get removed?


pkg/sql/catalog/systemschema_test/testdata/bootstrap line 22 at r1 (raw file):

	FAMILY "fam_2_hashedPassword" ("hashedPassword"),
	FAMILY "fam_3_isRole" ("isRole"),
	FAMILY fam_4_user_id (user_id)

just noticed this.. i wonder if it's better to just use the "correct" convention, or to use the camelCase convention that the other columns in this table have. i'm fine either way i guess..


pkg/sql/logictest/testdata/logic_test/grant_table line 1431 at r1 (raw file):

system         public       reports_meta                     root     SELECT
system         public       reports_meta                     root     UPDATE
system         public       role_id_seq                      root     DELETE

i believe we'll want to add something so that system sequences have different privileges -- they should give USAGE to admin and root as well. (also INSERT is a no-os for sequences)

rafiss
rafiss previously requested changes May 26, 2022
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RichardJCai and @stevendanna)


pkg/ccl/backupccl/backup_test.go line 9457 at r2 (raw file):

	defer cleanupEmptyCluster1()
	t.Run("restore-from-backup-with-system-role-members", func(t *testing.T) {
		// testuser should take ID 101. The restored users originally have no ID

why doesn't it start at ID 100? i thought we were using START WITH 100?


pkg/ccl/backupccl/backup_test.go line 9462 at r2 (raw file):

		sqlDBRestore1.Exec(t, "RESTORE SYSTEM USERS FROM $1", localFoo+"/3")

		sqlDBRestore1.CheckQueryResults(t, "SELECT username, \"hashedPassword\", \"isRole\", \"user_id\" FROM system.users", [][]string{

i'm not very familiar with RESTORE - could you explain what about it makes the restored IDs deterministic? (if restoring from v22.1 into v22.2)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of moving parts here. I think the gist of it works. More commentary in all the various places to tie the threads together would be good.

@@ -460,6 +460,7 @@ const (
TenantUsageTableID = 45
SQLInstancesTableID = 46
SpanConfigurationsTableID = 47
RoleIDSequenceID = 48
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a somewhat sad usage of a constant ID. I buy it, but I don't love it. We could do some other things -- if you're interested. One observation is that because of the caching protocol of writes to the users tables, we always will contend when updating any row in that table. That means it's totally okay to contend as part of updating the table.

Another note is that the use of a sequence in the system table is a little scary to me.

One possibility is we could use a dummy row in the table or something else illegal to store the allocator row? It's a little hacky to be sure, but it's as or more efficient, won't leak IDs, and saves us a fixed ID. WDYT?

Doing the above suggestion will make the migration more involved and will require changing more code. I can see the argument to do it the way you are doing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a somewhat sad usage of a constant ID. I buy it, but I don't love it. We could do some other things -- if you're interested. One observation is that because of the caching protocol of writes to the users tables, we always will contend when updating any row in that table. That means it's totally okay to contend as part of updating the table.

Yeah I believe it has to be constant since system.users depends on it. I don't find it sad really. Also what other things can we do here?

Another note is that the use of a sequence in the system table is a little scary to me.

Agreed, it is uncharted.

One possibility is we could use a dummy row in the table or something else illegal to store the allocator row? It's a little hacky to be sure, but it's as or more efficient, won't leak IDs, and saves us a fixed ID. WDYT?

Can you explain more here? Are you talking about in the migration?

}

const upsertRootStmt = `
UPSERT INTO system.users (username, "hashedPassword", "isRole", "user_id") VALUES ('root', '', false, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the plan to deal with the node user?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still curious

Comment on lines 89 to 91
user_id OID NOT NULL DEFAULT oid(nextval(48:::OID)),
CONSTRAINT "primary" PRIMARY KEY (username),
INDEX (user_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is weird

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation is still weird. I think tabs instead of space.

Comment on lines 89 to 91
user_id OID NOT NULL DEFAULT oid(nextval(48:::OID)),
CONSTRAINT "primary" PRIMARY KEY (username),
INDEX (user_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a unique index?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question still stands.

},
}

return migrateTable(ctx, cs, d, op, keys.UsersTableID, systemschema.UsersTable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's the part where you add the index?

Comment on lines 38 to 46
21,DropRole/drop_1_role
29,DropRole/drop_2_roles
37,DropRole/drop_3_roles
15,DropSequence/drop_1_sequence
17,DropSequence/drop_2_sequences
19,DropSequence/drop_3_sequences
15,DropTable/drop_1_table
17,DropTable/drop_2_tables
19,DropTable/drop_3_tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it obvious to you why these changed?

Comment on lines 44 to 50
got, err := txn.Get(ctx, tKey)
if err != nil {
return err
}
if got.Value.IsPresent() {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you getting rid of this defensive code? If you're going to get rid of it, why not get rid of the comment explaining it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should delete the comment here but I just moved the code outside of the if statement, it should apply regardless of if the ID is allocated or not I believe. If the table already exists we should just exit and not re-create it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the comment too? It's weird to just leave a comment that doesn't apply to any code.

Comment on lines +783 to +791
UPSERT INTO system.users (username, "hashedPassword", "isRole", "user_id") VALUES ($1, '', false, 1)
`
return r.execAsRootWithRetry(ctx, "addRootUser", upsertRootStmt, username.RootUser)
}

func addAdminRole(ctx context.Context, r runner) error {
// Upsert the admin role into the table. We intentionally override any existing entry.
const upsertAdminStmt = `
UPSERT INTO system.users (username, "hashedPassword", "isRole") VALUES ($1, '', true)
UPSERT INTO system.users (username, "hashedPassword", "isRole", "user_id") VALUES ($1, '', true, 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate that this is still a thing :(

#73813

@RichardJCai
Copy link
Contributor Author

You can have null values in unique indexes. A unique constraint only implies uniqueness for non-NULL values. Why don't we re-order this into a few different migrations to make our lives easier. Add the index in one step. Set the values in the next step. Make the column non-null in a final step. Should be a lot fewer lines. You can do a for loop and batching on the updates.

Damn shoot I didn't realize this :/

I updated it, the update query might be able to be made simpler though?

Here's how it is right now

		updateUserIDs := fmt.Sprintf(`
UPDATE system.users
   SET user_id = id
  FROM (
        SELECT *
          FROM (
                SELECT row_number() OVER () AS rn, *
                  FROM system.users
                 WHERE user_id IS NULL
               ) AS t1
               JOIN (
                  SELECT row_number() OVER () AS rn,
                         unnest AS id
                    FROM ROWS FROM (unnest($1))
                ) AS t2 ON t1.rn = t2.rn
       ) AS temp
 WHERE system.users.username = temp.username
       AND system.users.user_id IS NULL
 LIMIT %d`, numIDs)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the sequence? Can't the whole migration just be:

if _, err := d.InternalExecutor.ExecEx(
    ctx, "upsert-default-users-in-role-id-migration", nil /* txn */, sessiondata.NodeUserSessionDataOverride, `
UPSERT
  INTO system.users (username, "isRole", user_id)
VALUES ($1, false, $2), ($3, true, $4);
`, username.RootUser, username.RootUserID,
   username.AdminUser, username.AdminUserID,
); err != nil {
    return err
}
for {
   if n, err := d.InternalExecutor.ExecEx(
      ctx, "update-role-ids", nil /* txn */, sessiondata.NodeUserSessionDataOverride,`
UPDATE system.users
   SET user_id = nextval('system.public.role_id_seq'::REGCLASS)
 WHERE user_id IS NULL
 LIMIT 1000;`); err != nil || n == 0 {
      return err
   }
}

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @knz, @rafiss, @RichardJCai, and @stevendanna)

@RichardJCai
Copy link
Contributor Author

why not use the sequence? Can't the whole migration just be:

if _, err := d.InternalExecutor.ExecEx(
    ctx, "upsert-default-users-in-role-id-migration", nil /* txn */, sessiondata.NodeUserSessionDataOverride, `
UPSERT
  INTO system.users (username, "isRole", user_id)
VALUES ($1, false, $2), ($3, true, $4);
`, username.RootUser, username.RootUserID,
   username.AdminUser, username.AdminUserID,
); err != nil {
    return err
}
for {
   if n, err := d.InternalExecutor.ExecEx(
      ctx, "update-role-ids", nil /* txn */, sessiondata.NodeUserSessionDataOverride,`
UPDATE system.users
   SET user_id = nextval('system.public.role_id_seq'::REGCLASS)
 WHERE user_id IS NULL
 LIMIT 1000;`); err != nil || n == 0 {
      return err
   }
}

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @knz, @rafiss, @RichardJCai, and @stevendanna)

Figured we'd wanna batch the allocating role ids. If that doesn't seem necessary then I'm happy to go with the above.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah that's a good point about batching the sequence accesses. See how you feel about my suggestion.

Comment on lines +126 to +142
UPDATE system.users
SET user_id = id
FROM (
SELECT *
FROM (
SELECT row_number() OVER () AS rn, *
FROM system.users
WHERE user_id IS NULL
) AS t1
JOIN (
SELECT row_number() OVER () AS rn,
unnest AS id
FROM ROWS FROM (unnest($1))
) AS t2 ON t1.rn = t2.rn
) AS temp
WHERE system.users.username = temp.username
AND system.users.user_id IS NULL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the below command which also deals with updating the sequence?

  WITH start AS (
                SELECT max(user_id) AS start
                  FROM (
                         SELECT max(user_id) AS user_id FROM system.users
                         UNION ALL SELECT 100 -- the starting value for the sequence
                       )
             ),
       ids AS (
            SELECT i, i + start AS id
              FROM start, ROWS FROM (generate_series(0, 999)) AS t (i)
           ),
       users_to_update AS (
                        SELECT row_number() OVER () AS rn, name
                          FROM system.users
                         WHERE user_id IS NULL
                       ),
       updated AS (
                   UPDATE system.users
                      SET user_id = id
                     FROM (
                            SELECT name, id
                              FROM ids INNER JOIN users_to_update ON i = rn
                          )
                    WHERE username = name
                    LIMIT 1000
                RETURNING user_id
               )
SELECT setval('system.role_id_seq', max(user_id))
  FROM updated;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explored with the SQL statement above for a bit, one problem is that if there is a concurrent create user, the above as it as doesn't deal with it since it only sets the value at the end but doesn't block another concurrent create user from using it. Doing setval before doing updating works though.

One more gripe from me is that the CTE isn't actually easier to understand compared to code, at least for me.

How much do you dislike it as it is?

Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing the work of adding testing around the backup/restore!

pkg/ccl/backupccl/system_schema.go Show resolved Hide resolved
pkg/ccl/backupccl/system_schema.go Outdated Show resolved Hide resolved
pkg/ccl/backupccl/testdata/backup-restore/feature-flags Outdated Show resolved Hide resolved
pkg/ccl/backupccl/system_schema.go Outdated Show resolved Hide resolved
@RichardJCai RichardJCai force-pushed the user_id_in_system_table_05162022 branch from 1687d19 to 46113ff Compare August 3, 2022 19:38
Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backup portions LGTM.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with the migration as you have it :lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @knz, @rafiss, @RichardJCai, and @stevendanna)

@RichardJCai RichardJCai force-pushed the user_id_in_system_table_05162022 branch 2 times, most recently from bc69525 to 2b9aec9 Compare August 4, 2022 18:12
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <fenil.patel@cockroachlabs.com>
@RichardJCai RichardJCai force-pushed the user_id_in_system_table_05162022 branch from 2b9aec9 to 3d846bd Compare August 4, 2022 20:16
During restore, when users in the backup did not have ids, they'll be given one.

Release note: None
@RichardJCai RichardJCai force-pushed the user_id_in_system_table_05162022 branch from 3d846bd to 1bc6c6d Compare August 4, 2022 20:49
@RichardJCai
Copy link
Contributor Author

Thanks for reviewing all!
bors r+

@craig
Copy link
Contributor

craig bot commented Aug 5, 2022

👎 Rejected by code reviews

@RichardJCai
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 5, 2022

Build failed:

@RichardJCai
Copy link
Contributor Author

Flake
bors r+

@craig
Copy link
Contributor

craig bot commented Aug 5, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants