-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
expose vtbackup stats at --port /metrics #11388
expose vtbackup stats at --port /metrics #11388
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
If a new flag is being introduced:
If a workflow is added or modified:
Bug fixes
Non-trivial changes
New/Existing features
Backward compatibility
|
mysqlTimeout = 5 * time.Minute | ||
initDBSQLFile string | ||
detachedMode bool | ||
keepAliveTimeout = 0 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New, the rest is just formatting changes
go/cmd/vtbackup/vtbackup.go
Outdated
} | ||
|
||
func init() { | ||
mathrand.Seed(time.Now().UnixNano()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is needed. Copy-pasted from another cmd package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
@@ -191,6 +208,15 @@ func main() { | |||
log.Errorf("Couldn't prune old backups: %v", err) | |||
exit.Return(1) | |||
} | |||
|
|||
if keepAliveTimeout > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added for local testing, but think it could be useful in K8s context to keep process alive long enough for a Prometheus scrape interval.
// Catch SIGTERM and SIGINT so we get a chance to clean up. | ||
ctx, cancel := context.WithCancel(context.Background()) | ||
sigChan := make(chan os.Signal, 1) | ||
signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signal handling now the responsibility of servenv
f7bf347
to
5f17ded
Compare
Signed-off-by: Max Englander <max@planetscale.com>
5f17ded
to
bb164b4
Compare
} | ||
|
||
func init() { | ||
mathrand.Seed(time.Now().UnixNano()) | ||
servenv.RegisterDefaultFlags() | ||
dbconfigs.RegisterFlags(dbconfigs.All...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from main
servenv.OnParse(registerFlags) | ||
} | ||
|
||
func main() { | ||
defer exit.Recover() | ||
dbconfigs.RegisterFlags(dbconfigs.All...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to init
Signed-off-by: Max Englander <max@planetscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the other changes LGTM!
Signed-off-by: Max Englander <max@planetscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Description
As far as I can tell
vtbackup
does not currently expose metrics. It would be awesome if it did. In particular it would be great to have the following timings:This PR modifies
vtbackup
command so that a server is started on--port
. The server is similar to that launched by other VT components, and includes a/metrics
route. With this PR this route include two useful metrics which are managed by themysqlctl
package:vtbackup_restore_duration_seconds
vtbackup_backup_duration_seconds
Keeping this PR small to lay the groundwork. The other metrics can come in a later PR if we're OK with this overall approach.
Use cases
PlanetScale makes ongoing internal and public efforts to improve backup and restore performance. It would be great to have detailed metrics on current performance so that we can make informed decisions on where to put our energy.