Add a codec AES_128_GCM_SIV for encrypting columns on disk #19896
@depressed-pho, this is nice, but can we make it more generic? I.e., instead of an AES codec, define it as Encrypted with parameters. That would make it more generic, like the current 'encrypt' function: https://clickhouse.tech/docs/en/sql-reference/functions/encryption-functions/ For now it will only support AES_128_GCM_SIV, but we can extend it to other options later. |
Like |
It should be conditionally compiled under |
@depressed-pho thanks, a very nice feature! I noticed that there are no tests, could you please add some? To simplify this, you may want to reuse the unit-test suite from other codecs, please see |
Unit tests are optional but functional tests are mandatory. |
But I'm not sure how to write a functional test for this. Would it be enough to just add something like |
@depressed-pho Yes, you can add a new config with a script for static key, see |
If we make it more generic, it would be nice for the encryption parameters to the codec to match our current |
178b908 to d7957ec
Updated my patch. It's now named |
d7957ec to 90dc94c
Amended it again to fix a build failure occurring when !USE_SSL. |
90dc94c to 0ce27f6
Fixed !USE_SSL again. |
0ce27f6 to 8959ad8
Marked the test as being skipped in fasttest, where SSL is disabled. |
8959ad8 to 88824b7
Fixed several build errors found by the build check. |
And now only the Yandex-exclusive checks are failing. I can't do anything about that. |
88824b7 to a13e56c
Rebased and resolved a conflict. |
@depressed-pho Please, run a script |
a13e56c to c8fedc1
I did, but the script updated no |
c8fedc1 to 1b337bc
While this is implemented as a compression codec, it does not actually compress data. Instead, it encrypts data on disk. The key is obtained by executing a user-specified command at server startup; if no command is specified, the codec refuses to process any data. For now the only supported cipher is 'AES-128-GCM-SIV'.
1b337bc to 089cdcb
(deleted) This was wrong. |
This codec is actually mostly safe to use but some considerations exist:
I think we can merge it and highlight these considerations in docs. PS. I did not know that AES-GCM-SIV is a well known technique, initially I thought it is some dirty hack :) |
For other modes of operation it will be "instant death" if we do not generate and store random IVs. But if we do, replicas will disagree about the checksums of the "compressed" data and download data parts from each other excessively. So I think we should not allow more modes of operation here. |
For Linux with systemd:

```xml
<encryption>
This configuration can be confused with the configuration of encrypted disks.
Yes, probably a longer path in xml would be better, for example
<codecs>
    <aes_128_gcm_siv>
        <key_hex>...</key_hex>
    </aes_128_gcm_siv>
</codecs>
|
||
// turbob64 doesn't like whitespace characters in input. Strip
// them before decoding.
std::erase_if(b64_key, [](char c)
It has an inconsistency: in this place, the encryption key is in base64 while for encrypted disks it is in hex or binary.
And I think, it's better to read it in hex.
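To make the suggestion concrete, here is a minimal sketch of reading a 128-bit key from hex while tolerating stray whitespace (such as a trailing newline). The helper name `decodeHexKey` and its error handling are assumptions for illustration only; they are not taken from the patch or from ClickHouse.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

namespace
{
// Returns the value of one hex digit, or -1 if the character is not a hex digit.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}
}

// Hypothetical helper: decode a hex-encoded key into raw bytes.
// Whitespace is stripped first; the default expected size is 16 bytes
// (the key size of AES-128).
std::string decodeHexKey(std::string_view hex, std::size_t expected_size = 16)
{
    std::string digits;
    for (char c : hex)
        if (!std::isspace(static_cast<unsigned char>(c)))
            digits.push_back(c);

    if (digits.size() != expected_size * 2)
        throw std::invalid_argument("unexpected key length");

    std::string key;
    key.reserve(expected_size);
    for (std::size_t i = 0; i < digits.size(); i += 2)
    {
        const int hi = hexValue(digits[i]);
        const int lo = hexValue(digits[i + 1]);
        if (hi < 0 || lo < 0)
            throw std::invalid_argument("non-hex character in key");
        key.push_back(static_cast<char>((hi << 4) | lo));
    }
    return key;
}
```

Rejecting keys of the wrong length up front means a misconfigured key is reported when it is loaded rather than when a column is first read or written.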
{
#if USE_BASE64 && USE_SSL && USE_INTERNAL_SSL_LIBRARY

auto process = ShellCommand::execute(key_command);
Another inconsistency: here the key is loaded by a command placed in the config, while for encrypted disks we don't allow putting a script in the config file. Instead, we expect the key to be imported from an environment variable with the from_env=... attribute in the config.
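As a rough illustration of the environment-variable approach, the sketch below reads a key string from a named variable instead of running a command. The function `readKeyFromEnv` is hypothetical and only mirrors the idea behind `from_env`; it is not how ClickHouse resolves that attribute.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (an assumption for illustration, not ClickHouse code):
// return the value of the given environment variable, or std::nullopt
// if the variable is unset or empty.
std::optional<std::string> readKeyFromEnv(const char * variable_name)
{
    const char * value = std::getenv(variable_name);
    if (value == nullptr || *value == '\0')
        return std::nullopt;
    return std::string(value);
}
```

A caller could treat `std::nullopt` the same way the patch treats a missing key command, that is, by refusing to encrypt or decrypt any data.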
{
// Fixed nonce. Yes this is unrecommended, but we have to live
// with it.
std::string_view nonce("\0\0\0\0\0\0\0\0\0\0\0\0", 12);
Is it actually OK for AES_GCM_SIV?
Need help from @vitlibar for further polishing. |
* master: (482 commits) ru version colon en version colon Attempt to fix flaky 00705_drop_create_merge_tree Help with ClickHouse#26424 Adjust 00537_quarters to be timezone independent Add a codec AES_128_GCM_SIV for encrypting columns on disk (ClickHouse#19896) Update BaseDaemon.cpp Update PULL_REQUEST_TEMPLATE.md edits after review small fix in links Make the test parallelizable ru version edit ru version en version Fix 01600_quota_by_forwarded_ip typo Remove trailing whitespaces from docs Remove trailing whitespaces from docs fix system.zookeeper_log initialization Move formatBlock to its own file ... # Conflicts: # programs/server/Server.cpp
My thoughts about that:
or, for multiple keys
or
|
Yes, 100%. It was just some false advice... |
+1, it will be consistent with full disk encryption that we have implemented. |
+1. Using KDF will make it inconsistent with full disk encryption. |
While this is implemented as a compression codec, it does not actually compress data. Instead, it encrypts data on disk. The key is obtained by executing a user-specified command at server startup; if no command is specified, the codec refuses to process any data.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
AES_128_GCM_SIV which encrypts columns instead of compressing them.
Detailed description / Documentation draft: