Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support encrypting backup data (with AES-GCM) with flag --encryption-config-path #418

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

KevinWang15
Copy link

@KevinWang15 KevinWang15 commented Jan 21, 2022

Hi,

Thanks for this project, I find it very helpful.

In my use case, I share the storage with other users, so I'd like to have my backup data encrypted.

I added this feature and decided to open this PR in case you may find it useful.


What this PR does / why we need it:

This PR makes it possible to encrypt the backup data.
A user can now optionally specify a flag --encryption-config-path that points to a JSON file containing SnapstoreEncryptionConfig ( e.g. { "key": "your-encryption-key" } ).
Backups will be encrypted (with AES-GCM) according to the specs.
It is also possible to update the specs on the fly - just change the content of the JSON file.

Release note:

support encrypting backup data (with AES-GCM) with flag `--encryption-config-path` 

@KevinWang15 KevinWang15 requested a review from a team as a code owner January 21, 2022 12:52
@gardener-robot
Copy link

@KevinWang15 Thank you for your contribution.

@gardener-robot gardener-robot added the needs/review Needs review label Jan 21, 2022
@CLAassistant
Copy link

CLAassistant commented Jan 21, 2022

CLA assistant check
All committers have signed the CLA.

@gardener-robot gardener-robot added the size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) label Jan 21, 2022
@gardener-robot-ci-2
Copy link
Contributor

Thank you @KevinWang15 for your contribution. Before I can start building your PR, a member of the organization must set the required label(s) {'reviewed/ok-to-test'}. Once started, you can check the build status in the PR checks section below.

@rfranzke rfranzke added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 21, 2022
@rfranzke
Copy link
Member

Welcome @KevinWang15! :) Please sign the CLA

@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 21, 2022
@ishan16696
Copy link
Member

ishan16696 commented Mar 7, 2022

Hi @KevinWang15 ,
Kindly asking you to rebase the PR on latest master as backup-restore now move to Go v1.17 and now pipeline checks won't get fail.

@KevinWang15
Copy link
Author

Hi @ishan16696 , I have rebased this PR

@ishan16696 ishan16696 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 9, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 9, 2022
Copy link
Contributor

@sibucan sibucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there, I saw this great PR, and I thought I would make a few drive-by comments! See below for some thoughts:

pkg/dataencryption/dataencryption.go Outdated Show resolved Hide resolved
pkg/dataencryption/dataencryption.go Outdated Show resolved Hide resolved
@KevinWang15 KevinWang15 changed the title support env DATA_ENCRYPTION_KEY for transparently encrypting backup data support transparently encrypting backup data (AES-GCM) Mar 14, 2022
@KevinWang15 KevinWang15 changed the title support transparently encrypting backup data (AES-GCM) support encrypting backup data (with AES-GCM) with flag --encryption-config-path Mar 14, 2022
@KevinWang15 KevinWang15 removed the request for review from a team March 14, 2022 07:14
@KevinWang15
Copy link
Author

Hi @sibucan, I have addressed these problems, PTAL again, thanks.

@ishan16696 ishan16696 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 16, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Mar 16, 2022
@ishan16696
Copy link
Member

ishan16696 commented May 2, 2022

Hi @KevinWang15 ,
One check got failed in pipeline with this error:

pkg/dataencryption/snapstore.go:106:1: exported function DecorateSnapStore should have comment or be unexported
pkg/types/snapstore.go:226:6: exported type SnapstoreEncryptionConfig should have comment or be unexported

Can you fix this and you can also Locally test this by running make check.
Sorry for delaying in reviewing. I will review this PR ASAP

Copy link
Member

@ishan16696 ishan16696 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a example for encryption-config file here: https://github.com/gardener/etcd-backup-restore/tree/master/example

pkg/dataencryption/snapstore.go Outdated Show resolved Hide resolved
@ishan16696
Copy link
Member

{ "enabled": true, "key": "your-encryption-key" }

Can we move this enabled from the file to a configurable flag like this --enable-backup-encryption with default set to false.
Why I'm asking this is because if somebody don’t want to use the backup-encryption they still have to pass the --encryption-config-path with "enabled": false in file which doesn’t make much sense.

or if --encryption-config-path is optional then it itself can act as indicator that whether the encryption is enabled or not. WDYT?

@gardener-robot gardener-robot added the needs/changes Needs (more) changes label May 4, 2022
@KevinWang15
Copy link
Author

KevinWang15 commented May 4, 2022

Can you fix this and you can also Locally test this by running make check.

Okay, I will.

or if --encryption-config-path is optional then it itself can act as indicator that whether the encryption is enabled or not. WDYT?

I think it is a good idea to make --encryption-config-path optional, if the user doesn't want encryption, then they just don't pass in this flag. I will make sure this is the behavior, and remove enabled field.


The reason I added enabled field before was, the user can change this field in the JSON file to disable encryption on-the-fly. I guess this feature isn't really necessary, besides, setting key in the JSON file to an empty string will have the same effect.

@KevinWang15
Copy link
Author

I have updated the code, PTAL again

@ishan16696
Copy link
Member

ishan16696 commented May 5, 2022

The reason I added enabled field before was, the user can change this field in the JSON file to disable encryption on-the-fly.

hmm ... make sense but what about this scenario:

1. Enable the encryption (now snapshots are encrypted and uploaded to snapstore).
2. Now, disable the encryption (from this point snapshots are not encrypted and uploaded to snapstore).
3. Now we want to restore our etcd from my backup bucket.

Doubt 1: How it gonna distinguish between encrypted snapshot(which needs decryption while restoration) and non-encrypted snapsho(no need of decryption) to restore the etcd.
Doubt 2: When it start a restoration with disable the encryption or with enable the encryption, how it will behave in fetching the snapshot.

@KevinWang15
Copy link
Author

@ishan16696 Right, I guess changing encryption config on the fly isn't really practical.
I can update the code to only read encryption config once at start, to avoid the confusion.
What do you think? And do you see other issues with this PR?

@ishan16696
Copy link
Member

ishan16696 commented May 5, 2022

I guess changing encryption config on the fly isn't really practical.

Its not only about enabling encryption on the fly but also it’s about backward compatibility.
Lets suppose I want to enable this feature for my already running cluster but with same reason as above(#418 (comment)) restoration won’t work.
It means this feature only works with fresh cluster created with this feature with enable encryption. I hope you got what’s my concern.

To solve this issue of backward compatibility I would suggest to use a suffix appended to name of snapshot.
Basically something like this:

Full-00000000-00000001-1651731205.gz (without encryption enabled with compression )
Full-00000000-00000010-1651738200.crypt.gz (with encryption enabled and with compression )

and using this crypt suffix we can determine whether we need decryption or not while restoration. WDYT?
Please let me know if you need help.
you can also use some other relevant suffix. I use crypt just for an example .

@ishan16696
Copy link
Member

we also used the similar technique of appending a suffix in our compression implementation . check here

@sibucan
Copy link
Contributor

sibucan commented May 5, 2022

@ishan16696 I can see how there's value in appending a suffix to the snapshot to indicate whether it's encrypted or not, since it provides a path to restoration when you have e.g. some encrypted incremental snapshots over an unencrypted full snapshot, which is an expected situation when you've recently turned on encryption and a restoration needs to happen. But I think we should keep in mind:

  1. Disabling encryption (after having encryption enabled) via emptying out the key file, or by removing the flag and restarting the application should come with a huge warning that it will make restoration from earlier snapshots impossible, as it means that the key used for encrypting previous snapshots won't be available. I am of the opinion that this should be expected behavior, and that we should tell the user to take a manual full snapshot after disabling decryption but before restarting the application. This brings me to my second point:
  2. Rotating the key will also produce this behavior, and I think that without the capability of rotating the key while the application is still running, this feature is incomplete. The only sane way of rotating the key is by having backup-restore always read the from the file when encrypting a snapshot so we can change the key without restarting the application. There's also no way to indicate on an encrypted snapshot which key was used, which means that we should direct the user to take a manual full snapshot after changing the key being used.

To summarize, I think your suggestions are required for the (common) use case of turning on encryption, but once encryption is enabled, it really should be the user's knowledge that turning off encryption or rotating the key will invariably cause certain snapshots to be unrecoverable. It is their responsibility to ensure that in those situations, they take a manual snapshot to ensure that they're able to restore after that change. Please let me know if any of this is wrong.

@KevinWang15
Copy link
Author

My use case didn't involve rotating the key so I didn't think about it as much as you did. 😃
I understand your points. There's no conclusion yet? Maybe we can automatically trigger a full backup after encryption config has changed?

Anyway, let me know when you have decided how to do it. I will be happy to update the PR.

@ishan16696
Copy link
Member

ishan16696 commented May 6, 2022

hmm ... I'm against the need of manual intervention of take the snapshot.
what we can do is whenever a key is rotated or turned-off then we can trigger an out-of-schedule full snapshot.

case 1: when key is rotated/change.

backup-restore can have a `hash(key)` , if this hash changes it should imply that key has been rotated.
-> it will trigger a full snapshot with new key used for encryption.
->  update the hash value with this new key to detect future key rotation.

case 2: disabling the encryption on the fly.

I think we can use this ` { "enabled"}`  as an of indication of encryption turn on/off on the fly.
-> in this case also we have to trigger an out-of-schedule full snapshot.
-> update the hash value as well too.

WDYT?

cc @sibucan @aaronfern @abdasgupta @KevinWang15

@ishan16696
Copy link
Member

ishan16696 commented May 6, 2022

From our internal discussion we found out that there could be many edge cases possible with this feature :
enable/disable encryption with key rotation and triggering a full snapshot on every key rotation could be costly operation. So, there could be many aspects which we have to look into it and handle them as well .
As we are playing backup-data which could be very risky if something went wrong, so handling of edge cases is necessary.

@KevinWang15
Copy link
Author

Yes, I agree. Take your time, there's no hurry merging this PR

@aaronfern
Copy link
Contributor

case 2: disabling the encryption on the fly.

I think we can use this ` { "enabled"}`  as an of indication of encryption turn on/off on the fly.
-> in this case also we have to trigger an out-of-schedule full snapshot.
-> update the hash value as well too.

@ishan16696 with case 2 here, are you suggesting that we have the --encryption-config-path as part of the mounted JSON file rather than an env var?

@gardener-robot gardener-robot added the needs/rebase Needs git rebase label Jul 8, 2022
@gardener-robot
Copy link

@KevinWang15 You need rebase this pull request with latest master branch. Please check.

@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Jan 4, 2023
@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/rotten Nobody worked on this for 12 months (final aging stage) needs/changes Needs (more) changes needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/rebase Needs git rebase needs/review Needs review size/m Size of pull request is medium (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants