Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: Fix generated column can refer only to generated columns defined prior to it. #11549

Merged
merged 18 commits into from
Aug 8, 2019

Conversation

AilinKid
Copy link
Contributor

@AilinKid AilinKid commented Jul 31, 2019

What problem does this PR solve?

This PR tries to solve the following bug that caused by the order of generated columns and non-generated columns:

TiDB master:

tidb> create table t (a int, b int as (a + 1), c int as (b + 1))
Query OK, 0 rows affected (0.16 sec)

tidb> insert into t set a=1;
Query OK, 1 rows affected (0.0 sec)

tidb>alter table t add column e int as (c + 1) first;
Query OK, 0 rows affected (0.16 sec)

tidb> select * from t
+--------+---+---+---+
| e      | a | b | c |
+--------+---+---+---+
| <null> | 1 | 2 | 3 |                      
+--------+---+---+---+
1 row in set

tidb> desc select * from t;
+-----------------+----------+------+-------------------------------------------------------------------------------------------------------------------+
| id              | count    | task | operator info                                                                                                     |
+-----------------+----------+------+-------------------------------------------------------------------------------------------------------------------+
| Projection_5    | 10000.00 | root | cast(plus(cast(plus(test.t.b, 1)), 1)), test.t.a, cast(plus(test.t.a, 1)), cast(plus(cast(plus(test.t.a, 1)), 1)) |
| └─TableReader_7 | 10000.00 | root | data:TableScan_6                                                                                                  |
|   └─TableScan_6 | 10000.00 | cop  | table:t, range:[-inf,+inf], keep order:false, stats:pseudo                                                        |
+-----------------+----------+------+-------------------------------------------------------------------------------------------------------------------+

In MySQL 5.7 & 8.0:

mysql> create table t (a int, b int as (a + 1), c int as (b + 1))
Query OK, 0 rows affected (0.16 sec)

mysql> insert into t set a=1;
Query OK, 1 rows affected (0.0 sec)

mysql>alter table t add column e int as (c + 1) first;
(3107, u'Generated column can refer only to generated columns defined prior to it.')

Before putting add generated column to ddl job, computing the generated column offset itself and judging the relation with depended generated column offset as well.

What is changed and how it works?

Find the generated column offset itself in func findPositionRelativeColumn
judge the relation with depended generated column offset in func checkDependedColValid

Check List

Tests

  • Unit test
  • Integration test

@AilinKid AilinKid changed the title Fix generated column can refer only to generated columns defined prior to it. ddl: Fix generated column can refer only to generated columns defined prior to it. Jul 31, 2019
@AilinKid AilinKid added the type/bugfix This PR fixes a bug. label Jul 31, 2019
@AilinKid
Copy link
Contributor Author

/rebuild

@AilinKid
Copy link
Contributor Author

/run-all-tests

@codecov
Copy link

codecov bot commented Aug 1, 2019

Codecov Report

Merging #11549 into master will decrease coverage by 0.6576%.
The diff coverage is 87.5%.

@@               Coverage Diff                @@
##             master     #11549        +/-   ##
================================================
- Coverage   81.8655%   81.2079%   -0.6577%     
================================================
  Files           427        426         -1     
  Lines         94759      91874      -2885     
================================================
- Hits          77575      74609      -2966     
- Misses        11830      11895        +65     
- Partials       5354       5370        +16

@Deardrops
Copy link
Contributor

Deardrops commented Aug 1, 2019

There is an implement for detecting if there is a generated column refer to another generated column occurring earlier in the table. Could we multiplex this function?

func checkGeneratedColumn(colDefs []*ast.ColumnDef) error {

if attr.generated && attribute.position <= attr.position {
// A generated column definition can refer to other
// generated columns occurring earlier in the table.
err := errGeneratedColumnNonPrior.GenWithStackByArgs()
return errors.Trace(err)

rs.Close()
}
c.Assert(err, NotNil)
c.Assert(err.Error(), Equals, "[ddl:1054]Unknown column 'f' in 'generated column function'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please consider using assertErrorCode()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please consider using assertErrorCode()?

Addressed @tangenta

ddl/ddl_api.go Outdated Show resolved Hide resolved
_, err = tk.Exec("alter table t add column(e int as (c+1))")
c.Assert(err, IsNil)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to test the same case twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to test the same case twice?

Addressed, PTAL

@AilinKid
Copy link
Contributor Author

AilinKid commented Aug 1, 2019

There is an implement for detecting if there is a generated column refer to another generated column occurring earlier in the table. Could we multiplex this function?

func checkGeneratedColumn(colDefs []*ast.ColumnDef) error {

if attr.generated && attribute.position <= attr.position {
// A generated column definition can refer to other
// generated columns occurring earlier in the table.
err := errGeneratedColumnNonPrior.GenWithStackByArgs()
return errors.Trace(err)

Thx very much.
I’ve check it. Logic is same to me, but func is unable to use.
It’s in create table statement, so the depended column parameter is all ast.colDefs.
but here are only columns instead when you in add column statement.

I've changed it: PTAL
To reduce intrusion into original func, I've kept it real.
To be more clear, I've simulated the similar func definition and name.

@AilinKid
Copy link
Contributor Author

AilinKid commented Aug 1, 2019

There is an implement for detecting if there is a generated column refer to another generated column occurring earlier in the table. Could we multiplex this function?

func checkGeneratedColumn(colDefs []*ast.ColumnDef) error {

if attr.generated && attribute.position <= attr.position {
// A generated column definition can refer to other
// generated columns occurring earlier in the table.
err := errGeneratedColumnNonPrior.GenWithStackByArgs()
return errors.Trace(err)

Addressed @Deardrops

ddl/db_integration_test.go Outdated Show resolved Hide resolved
ddl/ddl_api.go Outdated Show resolved Hide resolved
_, ok := dependCols[col.Name.L]
if col.Offset >= pos && ok && col.IsGenerated() {
// Generated column can refer only to generated columns defined prior to it.
return errGeneratedColumnNonPrior.GenWithStackByArgs()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if _, ok := dependCols[col.Name.L]; ok {
	if col.IsGenerated() && col.Offset >= pos {
		// Generated column can refer only to generated columns defined prior to it.
		return errGeneratedColumnNonPrior.GenWithStackByArgs()
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

if pos == nil {
return position, nil
}
if pos.Tp == ast.ColumnPositionFirst {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: changing to select ... case ... pattern, for better readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It handles like ddl/column.103 when in ddl woker.

@@ -50,10 +50,28 @@ func verifyColumnGeneration(colName2Generation map[string]columnGenerationInDDL,
return nil
}

// verifyColumnGenerationSingle is for ADD GENERATED COLUMN, we just need verify one column itself.
func verifyColumnGenerationSingle(dependCols map[string]struct{}, cols []*table.Column, position *ast.ColumnPosition) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first parameter of this function is dependCols, but when call this function, you passed the parameter named duplicateColNames, is them same?
https://github.com/pingcap/tidb/blob/a7cf2762510cb1f4efd14d6ddd1ae08558ded4e8/ddl/ddl_api.go#L2113

Copy link
Contributor Author

@AilinKid AilinKid Aug 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. Since dependCols will change in checkDependedColExist, so I make it copy.
Parameter Addressed.

@AilinKid
Copy link
Contributor Author

AilinKid commented Aug 2, 2019

/run-all-tests

@Deardrops
Copy link
Contributor

LGTM

@AilinKid AilinKid added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 2, 2019
Copy link
Contributor

@tangenta tangenta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AilinKid AilinKid added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 2, 2019
@ngaut ngaut requested a review from zimulala August 3, 2019 08:56
ddl/ddl_api.go Outdated
@@ -2101,8 +2100,17 @@ func (d *ddl) AddColumn(ctx sessionctx.Context, ti ast.Ident, spec *ast.AlterTab
if err = checkAutoIncrementRef(specNewColumn.Name.Name.L, dependColNames, t.Meta()); err != nil {
return errors.Trace(err)
}
duplicateColNames := make(map[string]struct{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make map with capacity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

ddl/ddl_api.go Outdated Show resolved Hide resolved
ddl/db_integration_test.go Outdated Show resolved Hide resolved
ddl/generated_column.go Outdated Show resolved Hide resolved
ddl/generated_column.go Outdated Show resolved Hide resolved
ddl/generated_column.go Outdated Show resolved Hide resolved
// findPositionRelativeColumn return a pos relative to added generated column position
func findPositionRelativeColumn(cols []*table.Column, pos *ast.ColumnPosition) (int, error) {
position := len(cols)
// get column position default is append behind
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// get column position default is append behind
// get column position default is append behind

What does this comment mean? I'm confused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.
It's should be: Get column position, the default is cols's length means appending behind.

if col == nil {
return -1, ErrBadField.GenWithStackByArgs(pos.RelativeColumn, "generated column function")
}
// Insert position is after the mentioned column.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

position cannot be after a column, you can say 'Inserted column is after the mentioned column'.

Copy link
Contributor Author

@AilinKid AilinKid Aug 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed. Yes, shouldn't be a verb here.

ddl/generated_column.go Outdated Show resolved Hide resolved
@AilinKid AilinKid added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Aug 8, 2019
@bb7133
Copy link
Member

bb7133 commented Aug 8, 2019

/run-all-tests

@bb7133 bb7133 merged commit 110b073 into pingcap:master Aug 8, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Aug 8, 2019

cherry pick to release-3.0 failed

@sre-bot
Copy link
Contributor

sre-bot commented Aug 8, 2019

cherry pick to release-2.1 failed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants