-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config: Stop server startup if an unrecognized option is found in a config file #9855
Changes from 10 commits
ff42683
d5c8f8a
3be74b0
7f526b0
b4207f7
46f9037
3238d1e
6266fd8
344e3d7
1062aff
5de01e9
f65f452
eb2a90a
1d8d5ec
bb0ec3d
573f82a
2ec9439
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,7 @@ package config | |
import ( | ||
"crypto/tls" | ||
"crypto/x509" | ||
"fmt" | ||
"io/ioutil" | ||
"time" | ||
|
||
|
@@ -371,10 +372,22 @@ func GetGlobalConfig() *Config { | |
|
||
// Load loads config options from a toml file. | ||
func (c *Config) Load(confFile string) error { | ||
_, err := toml.DecodeFile(confFile, c) | ||
metaData, err := toml.DecodeFile(confFile, c) | ||
if c.TokenLimit <= 0 { | ||
c.TokenLimit = 1000 | ||
} | ||
|
||
// If any items in confFile file are not mapped into the Config struct, issue | ||
// an error and stop the server from starting. | ||
undecoded := metaData.Undecoded() | ||
if len(undecoded) > 0 && err == nil { | ||
undecodedItems := "" | ||
for _, item := range undecoded { | ||
undecodedItems += fmt.Sprintf(" %s \n", item.String()) | ||
kolbe marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
err = errors.Errorf("config file %s contained unknown configuration options:\n%s", confFile, undecodedItems) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. error generated by There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fine with me, but the trace is not my code (it was there before, so that a non-existent config file for example would be traced for some reason). |
||
} | ||
|
||
return errors.Trace(err) | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When upgrade from old version tidb to a newer version, some old config item maybe is not used in the newer version tidb. The rolling upgrade may meet error in this case, then the rolling upgrade failed.
How about only check the config items when
-config-check
is provided?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this suggestion. I think the behavior should be that tidb-server simply refuses to start if there are unknown configuration options, for the reasons I provided above.
A couple points:
Now that there's a
--config-check
option, it can be a part of the recommended procedure for a rolling upgrade, so that the only people running into the issue you describe are those not following the recommended procedure.The benefit of a rolling upgrade is that you never have to take the entire cluster offline. When upgrading the first node, it will refuse to start with this error. That's a fine time to address the problem, apply the change to the config files, get that first node upgraded, and then proceed with the rolling upgrade.
@morgo, your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think because our own CI broke, we may need to ease this change in slowly. Here would be my suggestion:
--config-check
as is, it is independently very useful.WARN
about each incorrect config item.--config-strict
, which defaults toFALSE
. We can document it that enabling it is a best practice, and we may change the default toTRUE
in the future. This will result in errors for invalid options.It is not bulletproof of course, because the default log level is
INFO
it is possible that the warnings may not be seen.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the very fact our own CI broke is evidence of the value of this behavior change. Even we weren't keeping track of whether our config files were well-formed!
Your suggestion above is difficult, because the logging system isn't set up yet when the config file is parsed. Either we have to keep around the list of undecoded items and check them later after logging is possible, or we have to pass some other string/structure/flag around, or we have to simply print something to stderr, or ...? @coocood Morgan mentioned that you may have some thoughts on this?
I dislike this. Adding this new flag means that people may start relying on it, and then we have to keep it around as a no-op if we later make that behavior the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree with Morgan's first and second suggestions. Our users may use old version ansible to upgrade, that doesn't pre-check the config using
tidb-server --config-check
, we need to keep the cluster stable when rolling upgrade.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, it sounds like there's a consensus. Can someone offer a suggestion of the best way to technically implement a "warning" before the logging system is set up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kolbe I will create a pull request to your branch later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kolbe#1 @kolbe PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@winkyao I added some comments on your PR. I re-wrote my own interpretation of how I see this working and pushed it to my branch of my fork, please take a look.
Basically I added a kind of unsettling amount of additional stuff to create a custom error in config.go, then use that in the case of failed config validation. That enables checking for the specific error type in main.go, and if configStrict is not enabled, it'll get the string from the warning and keep that until ater logging has been set up.
This is kind of a lot of extra work and adds some ugly things, but the whole idea behind strict config checking (and the --config-strict option) is that this is a temporary situation until we make --config-strict behavior the default (and hopefully only!) behavior of TiDB Server in the future. I think at that point we'd simply remove all this extra stuff with the custom error and strange handling in the server.
Thoughts?