Skip to content

Commit

Permalink
Support for transparent encryption
Browse files Browse the repository at this point in the history
Co-authored-by: Florin Peter <Florin-Alexandru.Peter@t-systems.com>
  • Loading branch information
FlorinPeter and Florin Peter authored Mar 26, 2022
1 parent f536835 commit 217308a
Show file tree
Hide file tree
Showing 19 changed files with 3,066 additions and 4 deletions.
3 changes: 3 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@ ENV \
S3PROXY_CORS_ALLOW_METHODS="" \
S3PROXY_CORS_ALLOW_HEADERS="" \
S3PROXY_IGNORE_UNKNOWN_HEADERS="false" \
S3PROXY_ENCRYPTED_BLOBSTORE="" \
S3PROXY_ENCRYPTED_BLOBSTORE_PASSWORD="" \
S3PROXY_ENCRYPTED_BLOBSTORE_SALT="" \
JCLOUDS_PROVIDER="filesystem" \
JCLOUDS_ENDPOINT="" \
JCLOUDS_REGION="" \
Expand Down
76 changes: 76 additions & 0 deletions docs/Encryption.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
S3Proxy

# Encryption

## Motivation
The motivation behind this implementation is to provide a fully transparent and secure encryption to the s3 client while having the ability to write into different clouds.

## Cipher mode
The chosen cipher is ```AES/CFB/NoPadding``` because it provides the ability to read from an offset like in the middle of a ```Blob```.
While reading from an offset the decryption process needs to consider the previous 16 bytes of the AES block.

### Key generation
The encryption uses a 128-bit key that will be derived from a given password and salt in combination with random initialization vector that will be stored in each part padding.

## How a blob is encrypted
Every uploaded part get a padding of 64 bytes that includes the necessary information for decryption. The input stream from a s3 client is passed through ```CipherInputStream``` and piped to append the 64 byte part padding at the end the encrypted stream. The encrypted input stream is then processed by the ```BlobStore``` to save the ```Blob```.

| Name | Byte size | Description |
|-----------|-----------|----------------------------------------------------------------|
| Delimiter | 8 byte | The delimiter is used to detect if the ```Blob``` is encrypted |
| IV | 16 byte | AES initialization vector |
| Part | 4 byte | The part number |
| Size | 8 byte | The unencrypted size of the ```Blob``` |
| Version | 2 byte | Version can be used in the future if changes are necessary |
| Reserved | 26 byte | Reserved for future use |

### Multipart handling
A single ```Blob``` can be uploaded by the client into multiple parts. After the completion all parts are concatenated into a single ```Blob```.
This procedure will result in multiple parts and paddings being held by a single ```Blob```.

### Single blob example
```
-------------------------------------
| ENCRYPTED BYTES | PADDING |
-------------------------------------
```

### Multipart blob example
```
-------------------------------------------------------------------------------------
| ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING |
-------------------------------------------------------------------------------------
```

## How a blob is decrypted
The decryption is way more complex than the encryption. Decryption process needs to take care of the following circumstances:
- decryption of the entire ```Blob```
- decryption from a specific offset by skipping initial bytes
- decryption of bytes by reading from the end (tail)
- decryption of a specific byte range like middle of the ```Blob```
- decryption of all previous situation by considering a underlying multipart ```Blob```

### Single blob decryption
First the ```BlobMetadata``` is requested to get the encrypted ```Blob``` size. The last 64 bytes of ```PartPadding``` are fetched and inspected to detect if a decryption is necessary.
The cipher is than initialized with the IV and the key.

### Multipart blob decryption
The process is similar to the single ```Blob``` decryption but with the difference that a list of parts is computed by fetching all ```PartPadding``` from end to the beginning.

## Blob suffix
Each stored ```Blob``` will get a suffix named ```.s3enc``` this helps to determine if a ```Blob``` is encrypted. For the s3 client the ```.s3enc``` suffix is not visible and the ```Blob``` size will always show the unencrypted size.

## Tested jClouds provider
- S3
- Minio
- OBS from OpenTelekomCloud
- AWS S3
- Azure
- GCP
- Local

## Limitation
- All blobs are encrypted with the same key that is derived from a given password
- No support for re-encryption
- Returned eTag always differs therefore clients should not verify it
- Decryption of a ```Blob``` will always result in multiple calls against the backend for instance a GET will result in a HEAD + GET because the size of the blob needs to be determined
5 changes: 5 additions & 0 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -461,6 +461,11 @@
<artifactId>commons-fileupload</artifactId>
<version>1.4</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.15</version>
</dependency>
<dependency>
<groupId>org.apache.jclouds</groupId>
<artifactId>jclouds-allblobstore</artifactId>
Expand Down
Loading

0 comments on commit 217308a

Please sign in to comment.