*: fix utf8 charset upgrade compatibility #9820
Codecov Report
@@ Coverage Diff @@
## master #9820 +/- ##
================================================
- Coverage 67.3615% 67.3261% -0.0354%
================================================
Files 383 383
Lines 80353 80370 +17
================================================
- Hits 54127 54110 -17
- Misses 21397 21416 +19
- Partials 4829 4844 +15
server/http_handler.go
@@ -743,6 +743,17 @@ func (h settingsHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
return
}
}
if treadOldVersionUTF8AsUTF8MB4 := req.Form.Get("treat-old-version-utf8-as-utf8mb4"); treadOldVersionUTF8AsUTF8MB4 != "" {
Add a session variable to set this config.
done.
But the "session variable" is actually system scope, not session scope.
Deleted the session variable, because the schema needs to be reloaded after this variable changes.
sessionctx/variable/tidb_vars.go
@@ -131,6 +131,9 @@ const (

// TiDBCheckMb4ValueInUtf8 is used to control whether to enable the check wrong utf8 value.
TiDBCheckMb4ValueInUtf8 = "tidb_check_mb4_value_in_utf8"

// TiDBCheckMb4ValueInUtf8 is used to control whether to enable the check wrong utf8 value.
Please fix this comment
done.
table/column.go
@@ -183,6 +183,12 @@ func CastValue(ctx sessionctx.Context, val types.Datum, col *model.ColumnInfo) (
}
str := casted.GetString()
utf8Charset := col.Charset == mysql.UTF8Charset
doUTF8Check := utf8Charset && config.GetGlobalConfig().CheckMb4ValueInUtf8
this should be doMB4CharCheck
done.
reset LGTM
…ge tidb_treat_old_version_utf8_as_utf8mb4 take effect
LGTM
LGTM
LGTM
/run-all-tests
/run-all-tests
Do we need to update the column's or table's version when executing AlterTableCharsetAndCollate or Change/Modify column? I ask because we need to add a tool to update old TiDB table versions to the new version.
@zimulala Not necessary. We can just change the charset to utf8mb4 and keep the table version unchanged.
We use TableInfoVersion2 to check, but we don't set table versions to TableInfoVersion2 or CurrLatestTableInfoVersion.
What problem does this PR solve?
In TiDB v2.0.8, when a table is created with a specified table charset, its columns do not use the table charset but use the UTF8 charset instead. This is a bug and has already been fixed.
After upgrading to TiDB master, which has the tidb_check_mb4_value_in_utf8 check, inserting x'f09f8c80' into a utf8 column returns an error.
For a client that upgrades from v2.0.8 to master, this looks like an incompatibility, because inserting x'f09f8c80' succeeds in TiDB v2.0.8 but fails in TiDB master.
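To illustrate why x'f09f8c80' is affected: those four bytes are the UTF-8 encoding of a single character (U+1F300), which does not fit in the 3-byte utf8 charset. The following is a minimal sketch of such a check, not the code TiDB uses; the function name hasMB4Rune is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// hasMB4Rune is a hypothetical helper (not taken from TiDB). It reports
// whether s contains a character whose UTF-8 encoding takes 4 bytes, which
// a 3-byte utf8 column cannot hold. Invalid bytes decode with size 1 and
// are therefore not counted.
func hasMB4Rune(s string) bool {
	for i := 0; i < len(s); {
		_, size := utf8.DecodeRuneInString(s[i:])
		if size == 4 {
			return true
		}
		i += size
	}
	return false
}

func main() {
	emoji := "\xf0\x9f\x8c\x80" // the bytes of x'f09f8c80'
	fmt.Println(hasMB4Rune(emoji))   // true
	fmt.Println(hasMB4Rune("héllo")) // false: é takes only 2 bytes
}
```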
What is changed and how it works?
Add the treat-old-version-utf8-as-utf8mb4 variable to the TOML config file and as a session variable (actually a system-scope variable), with a default value of true.
treat-old-version-utf8-as-utf8mb4 is used for upgrade compatibility. Setting it to true treats the UTF8 charset of old-version tables/columns as UTF8MB4.
How to judge whether a table/column is a new or old version?
This PR also increases the table and column version. Please merge parser PR pingcap/parser#254 first.
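The decision described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: the function effectiveCharset and the lowercase constants are hypothetical, and it assumes that "old version" means a version number below the one this PR introduces (the review thread refers to it as TableInfoVersion2).

```go
package main

import "fmt"

// Hypothetical stand-ins for the table/column info versions. The real
// constants live in pingcap/parser and may have different names or values.
const (
	tableInfoVersion1 uint16 = 1 // assumed: schemas created before this PR
	tableInfoVersion2 uint16 = 2 // assumed: schemas created after this PR
)

// effectiveCharset is a hypothetical helper. It returns the charset that
// should be used for checks, given the stored charset, the stored version,
// and the treat-old-version-utf8-as-utf8mb4 setting.
func effectiveCharset(charset string, version uint16, treatOldVersionUTF8AsUTF8MB4 bool) string {
	if charset == "utf8" && version < tableInfoVersion2 && treatOldVersionUTF8AsUTF8MB4 {
		// Old schemas may carry utf8 only because of the v2.0.8 bug.
		return "utf8mb4"
	}
	return charset
}

func main() {
	fmt.Println(effectiveCharset("utf8", tableInfoVersion1, true))  // utf8mb4
	fmt.Println(effectiveCharset("utf8", tableInfoVersion2, true))  // utf8
	fmt.Println(effectiveCharset("utf8", tableInfoVersion1, false)) // utf8
}
```

With this shape, only schemas that predate the version bump are reinterpreted, so a utf8 column created after the upgrade keeps its strict 3-byte check.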
Drawback
The treat-old-version-utf8-as-utf8mb4 variable is unfriendly to users who deliberately set the column charset to utf8.
Another
Maybe a better way to fix this compatibility problem is to rebuild the charset in the table info schema by fetching all DDL history, parsing it, and rebuilding the charset from it. This may take a fairly long time and will change the stored table schema data.
Check List
Tests
Code changes
Side effects
Related changes