Switch from finalizer to DeleteBackupRequest for deleting backups #383
I also think it would be worth it to add some code that runs once each time the server starts, removing the ark GC finalizer from any backups that have it.
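A rough sketch of that startup cleanup, assuming the finalizer is a plain string in the backup's `metadata.finalizers` list. The finalizer name `gc.ark.heptio.com` and the helper `removeFinalizer` are illustrative assumptions, and the actual update call against the API server is left out:

```go
package main

import "fmt"

// gcFinalizer is an assumed name for the finalizer the GC controller used to add.
const gcFinalizer = "gc.ark.heptio.com"

// removeFinalizer returns a copy of finalizers without target, plus a flag
// reporting whether anything was removed (so callers can skip no-op updates).
func removeFinalizer(finalizers []string, target string) ([]string, bool) {
	kept := make([]string, 0, len(finalizers))
	removed := false
	for _, f := range finalizers {
		if f == target {
			removed = true
			continue
		}
		kept = append(kept, f)
	}
	return kept, removed
}

func main() {
	// At server start, this would run once over every backup in the namespace.
	finalizers := []string{"example.com/other", gcFinalizer}
	kept, removed := removeFinalizer(finalizers, gcFinalizer)
	fmt.Println(kept, removed)
}
```

Returning the `removed` flag lets the caller update only the backups that actually carried the finalizer.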
pkg/backup/delete_helpers.go
Outdated
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func NewDeleteBackupRequest(name string) *v1.DeleteBackupRequest {
TODO: godoc
	deleteBackupRequestInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: c.enqueue,
TODO: need an UpdateFunc (e.g. request is created and modified before the controller sees it, status is still ""/New)
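One way to read this TODO: add and update events should funnel into the same check, which only lets through requests whose phase is still empty or `New`. A minimal sketch with stand-in types; the `request` struct, the `handlers` type, and `shouldEnqueue` are hypothetical and not the controller's real code:

```go
package main

import "fmt"

// request is a stand-in for a DeleteBackupRequest; only the fields used here are modeled.
type request struct {
	Name  string
	Phase string
}

// shouldEnqueue reports whether a request still needs processing.
func shouldEnqueue(r request) bool {
	return r.Phase == "" || r.Phase == "New"
}

// handlers mimics the add/update callback pair of an informer event handler.
type handlers struct {
	queue []string
}

// onAdd enqueues a newly observed request if it has not been processed yet.
func (h *handlers) onAdd(r request) {
	if shouldEnqueue(r) {
		h.queue = append(h.queue, r.Name)
	}
}

// onUpdate ignores the old object and applies the same check to the new one.
func (h *handlers) onUpdate(_, updated request) {
	h.onAdd(updated)
}

func main() {
	h := &handlers{}
	h.onAdd(request{Name: "req-1"})
	h.onUpdate(request{Name: "req-2"}, request{Name: "req-2", Phase: "New"})
	h.onUpdate(request{Name: "req-3"}, request{Name: "req-3", Phase: "Processed"})
	fmt.Println(h.queue)
}
```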
pkg/controller/gc_controller.go
Outdated
		},
		syncPeriod,
TODO: switch to resyncFunc in genericController
	name        string
	queue       workqueue.RateLimitingInterface
	logger      logrus.FieldLogger
	syncHandler func(key string) error
TODO: add resyncFunc
	}

	// Try to delete restores
	log.Info("Removing restores")
IMO we should remove restores when explicitly deleting a backup, but I have seen others mention that retaining restores as a matter of record would be beneficial. The restores would then be cleaned up via the GC controller. I think that's also a reasonable approach, wdyt?
I don't think this should be an option, though, we should only do one or the other.
I'm seeing now that the GC controller uses `DeleteBackupRequest`s to do a delete, so I suppose the 2nd approach would require shuffling some logic around. I'm fine with it as is.
initial pass
Gopkg.toml
Outdated
@@ -31,6 +31,11 @@ required = [
  "k8s.io/code-generator/cmd/informer-gen",
]

[prune]
cool!
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// DeleteBackupRequestList is a list of DeleteBackupRequests.
type DeleteBackupRequestList struct {
question - why isn't there a code-generator for the *List types?
Probably not really worth the effort
pkg/backup/delete_helpers.go
Outdated
	return &v1.DeleteBackupRequest{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: name + "-",
			Labels: map[string]string{
Is the backup name stored in a label in addition to the spec just for ease of filtering?
Yes, so we can delete all requests for backup x
pkg/cmd/cli/backup/delete.go
Outdated
}

func (o *DeleteOptions) Validate(c *cobra.Command, args []string, f client.Factory) error {
	if len(args) != 1 {
		return errors.New("you must specify only one argument, the backup's name")
	}

	kubeClient, err := f.KubeClient()
do we want to validate that the specified backup exists here?
Will do
func (c *backupDeletionController) processRequest(req *v1.DeleteBackupRequest) error {
	log := c.logger.WithFields(logrus.Fields{
		"namespace": req.Namespace,
		"name":      req.Spec.BackupName,
should the name field be `req.Spec.Name`? Maybe add that and the backup name as fields?
Will do
pkg/controller/generic_controller.go
Outdated
	}()

	c.logger.Info("Starting %s", c.name)
would probably be useful to add the controller name as a field to the c.logger
README.md
Outdated
@@ -135,28 +135,6 @@ ark restore describe <RESTORE_NAME>

For more information, see [the debugging information][18].

### Clean up
maybe leave this section, and specify that you can just delete the NS, and note that backups/snapshots will not be deleted
Ok. I'll make it clearer that this is optional and only if you want to remove stuff.
docs/azure-config.md
Outdated
@@ -115,7 +115,7 @@ Now you need to create a Secret that contains all the seven environment variable

```bash
kubectl create secret generic cloud-credentials \
-    --namespace <ARK_SERVER_NAMESPACE> \
+    --namespace <ARK_SERVER_ARK_NAMESPACE> \
ARK_NAMESPACE
Thanks
docs/ibm-config.md
Outdated
@@ -47,7 +47,7 @@ Create a Secret. In the directory of the credentials file you just created, run:

```bash
kubectl create secret generic cloud-credentials \
-    --namespace <ARK_SERVER_NAMESPACE> \
+    --namespace <ARK_SERVER_ARK_NAMESPACE> \
ARK_NAMESPACE
Thanks
-To run the server in another namespace, you edit the relevant files, changing `heptio-ark-server` to
+To run the server in another namespace, you edit the relevant files, changing `heptio-ark` to
One thing not explicitly called out here is that the cloud-credentials secret also needs to be created in the alternate namespace. It's templatized in each cloud-provider's walkthrough, but maybe worth just referencing it here too.
Will do
		r.Status.Phase = v1.DeleteBackupRequestPhaseProcessed
		r.Status.Errors = []string{"unable to delete backup because it includes PV snapshots and Ark is not configured with a PersistentVolumeProvider"}
	})
	if err != nil {
return err
f06cb35 to 1fe9dea
Some observations from testing:
1fe9dea to ea74e5d
Fixed
I'm not sure an error is appropriate here. Your intent is to delete a backup and all its associated data. If we can't find some of the associated data, that's not really an issue, is it? I would expect it to be a problem if you wanted to restore, of course. Maybe we need something as part of the backup sync controller that periodically verifies the integrity of a backup's associated data?
That's pushed up now.
I manually created a
So maybe always show if at least 1 has errors?
I'll link on both name and uid. This brings up something I haven't addressed yet - there's no expiration on |
Yeah, I think it's fine as-is.
Would be a nice feature to add, but not needed as part of this PR.
Yep, I'll look into it
I think that makes sense. Makes more sense to me than putting the backup in
Yeah - probably not urgent but would be a good idea to GC old ones, esp. those that aren't associated with any existing backup.
I can no longer produce the patch error - attempted deletion of a nonexistent backup is blocked by the CLI, and applying the resource directly works fine.
3960c93 to 54dabd3
@skriss added deletion of processed delete backup requests older than 24h
- Add pruning settings to Gopkg.toml
- Update vendoring deps doc to point to dep installation instructions and to use dep instead of hack/dep-save.sh
- Remove hack/dep-save.sh

Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Now that we've configured pruning for dep, this removes all unused packages, all non-go files, and all tests from the vendor directory. NOTE: due to a change in dep, it preserves anything that looks like a license file. We'll be pulling in a few files we weren't previously using - mostly license files. It's easier to just go with what dep does than to try to exclude them after the fact. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Use a custom builder image to do all of Ark's builds. This image now contains k8s.io/code-generator for code generation. Enable docker in travis to use the builder image. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
We ran into a lot of problems using a finalizer on the backup to allow the Ark server to clean up all associated backup data when deleting a backup. Users also found it less than desirable that deleting the heptio-ark namespace resulted in all the backup data being deleted. This removes the finalizer and replaces it with an explicit DeleteBackupRequest that is created as a means of requesting the deletion of a backup and all its associated data. This is what `ark backup delete` does. If you use kubectl to delete a backup or to delete the heptio-ark namespace, this no longer deletes associated backups. Additionally, as long as the heptio-ark namespace still exists, the Ark server's BackupSyncController will continually sync backups into the heptio-ark namespace from object storage. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
54dabd3 to a4d5061
Latest commit LGTM
Always request DeleteBackupRequests for a given backup so we can show failed deletion attempts if you try to delete a backup that has PV snapshots when Ark doesn't have a persistentVolumeProvider configured. When creating a DeleteBackupRequest, include a label for the UID so we can match based on name and UID when associating DeleteBackupRequests with a given backup. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
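A minimal sketch of matching on both labels. The label keys `backup-name` and `backup-uid` follow the names used in this PR's commit messages, but their exact form in the code is an assumption, and `deleteRequest` and `requestsForBackup` are hypothetical stand-ins:

```go
package main

import "fmt"

const (
	backupNameLabel = "backup-name" // assumed label key
	backupUIDLabel  = "backup-uid"  // assumed label key
)

// deleteRequest is a stand-in for a DeleteBackupRequest with only its labels modeled.
type deleteRequest struct {
	Name   string
	Labels map[string]string
}

// requestsForBackup returns the requests whose labels match both the backup's
// name and its UID, so a recreated backup with the same name is not matched.
func requestsForBackup(reqs []deleteRequest, name, uid string) []deleteRequest {
	var matched []deleteRequest
	for _, r := range reqs {
		if r.Labels[backupNameLabel] == name && r.Labels[backupUIDLabel] == uid {
			matched = append(matched, r)
		}
	}
	return matched
}

func main() {
	reqs := []deleteRequest{
		{Name: "nightly-abc12", Labels: map[string]string{backupNameLabel: "nightly", backupUIDLabel: "uid-1"}},
		{Name: "nightly-def34", Labels: map[string]string{backupNameLabel: "nightly", backupUIDLabel: "uid-2"}},
	}
	for _, r := range requestsForBackup(reqs, "nightly", "uid-1") {
		fmt.Println(r.Name)
	}
}
```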
Make sure a DeleteBackupRequest has its Spec.BackupName filled in. If not, record an error in the status and mark the request as processed. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
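The check described in this commit message might be sketched as follows. The struct layout, the error text, and the `Processed` phase string are assumptions for illustration only:

```go
package main

import "fmt"

const phaseProcessed = "Processed" // assumed string value of the processed phase

// deleteRequest is a stand-in for a DeleteBackupRequest (spec and status flattened).
type deleteRequest struct {
	BackupName string
	Phase      string
	Errors     []string
}

// validate reports whether the request names a backup. If it does not, the
// request is marked processed with an error recorded, so it is not retried.
func validate(req *deleteRequest) bool {
	if req.BackupName != "" {
		return true
	}
	req.Phase = phaseProcessed
	req.Errors = append(req.Errors, "spec.backupName is required")
	return false
}

func main() {
	req := &deleteRequest{}
	fmt.Println(validate(req), req.Phase, req.Errors)
}
```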
last 2 also LGTM
When the BackupDeletionController processes a request, set the request's backup-name and backup-uid labels if they aren't currently set. Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
2 things found testing:
Otherwise, all LGTM!
As a followup to the previous comment, the scenario in the 2nd bullet also errors on AWS.
The problem is knowing if an InProgress backup is truly in progress, or stuck. Maybe it's running, or maybe the Ark server was restarted in the middle of the backup. We could add an in-memory list/map/set of in progress backups and disallow deleting those.
This requires that we change the protobuf messages to return better errors, so we can identify "not found" properly. I'd like to defer fixing this until we make the plugin/protobuf changes. That ok with you?
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Latest commit keeps track of in-progress backups in memory and doesn't allow deletion of them.
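A sketch of what such an in-memory tracker could look like, assuming backups are keyed by namespace and name. The type and method names here are hypothetical and may differ from the ones in the commit:

```go
package main

import (
	"fmt"
	"sync"
)

// inProgressTracker is a concurrency-safe set of backups currently being processed.
type inProgressTracker struct {
	mu      sync.RWMutex
	backups map[string]struct{}
}

func newInProgressTracker() *inProgressTracker {
	return &inProgressTracker{backups: make(map[string]struct{})}
}

// key builds the set key from namespace and name.
func key(ns, name string) string { return ns + "/" + name }

// Add records that the backup has started.
func (t *inProgressTracker) Add(ns, name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.backups[key(ns, name)] = struct{}{}
}

// Delete records that the backup has finished (successfully or not).
func (t *inProgressTracker) Delete(ns, name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.backups, key(ns, name))
}

// Contains reports whether the backup is currently in progress.
func (t *inProgressTracker) Contains(ns, name string) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	_, ok := t.backups[key(ns, name)]
	return ok
}

func main() {
	tracker := newInProgressTracker()
	tracker.Add("heptio-ark", "nightly")
	fmt.Println(tracker.Contains("heptio-ark", "nightly")) // deletion would be refused here
	tracker.Delete("heptio-ark", "nightly")
	fmt.Println(tracker.Contains("heptio-ark", "nightly")) // deletion may proceed
}
```

Because the set lives only in memory, a server restart empties it, which means a backup left `InProgress` by a crashed server is not treated as running.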
OK, latest LGTM
We ran into a lot of problems using a finalizer on the backup to allow
the Ark server to clean up all associated backup data when deleting a
backup.
Users also found it less than desirable that deleting the heptio-ark
namespace resulted in all the backup data being deleted.
This removes the finalizer and replaces it with an explicit
DeleteBackupRequest that is created as a means of requesting the
deletion of a backup and all its associated data. This is what
`ark backup delete` does.
If you use kubectl to delete a backup or to delete the heptio-ark
namespace, this no longer deletes associated backups. Additionally, as
long as the heptio-ark namespace still exists, the Ark server's
BackupSyncController will continually sync backups into the heptio-ark
namespace from object storage.
TODO:
- `ark backup describe` to show `DeleteBackupRequest` status if relevant

Fixes #375
Fixes #376
Fixes #358