# added NewArchitecture but still needs a visual #330

**Open** · njriasan wants to merge 5 commits into `master`
## Conversation

**@njriasan** commented Mar 6, 2019:

This is the template for the new architecture. You may want to wait for me to make a visual, but I wanted to know if you had any base images you wanted me to work with first.

**@shankari** (Contributor) left a comment:

Looks pretty good overall. Just some minor typos and clarifications. Will wait for feedback from @jf87 before merging.

> ## Overview
>
> The plans to change the e-mission architecture are oriented around keeping user data encrypted and only decrypting the data when an approved service or algorithm needs to run on it. The general workflow for maintaining this is:
> 1. The user collects data from the application. The application uses a phone-specific private key to encrypt the data and sends the encrypted data to the server.
**@shankari:** I've been thinking about this, and at least for this set of requirements and implementation, I believe we don't need a private key (e.g. part of a public-private keypair). We just need a secret key (can be symmetric) that is only known by the user.

**@jf87:** I think you want to have a schema similar to PGP, where we create a random symmetric key which we use to encrypt the data with AES or similar. Then we can use the public key of the server to encrypt that key, so that the server can securely decrypt the key and then use it to decrypt the actual data. This way you also avoid that a compromise of a single secret key compromises all data.
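The PGP-style scheme jf87 describes could be sketched roughly as follows, using the `cryptography` package. This is a sketch only: the function names and the choice of RSA-OAEP plus AES-GCM are illustrative assumptions, not part of the proposal.

```python
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives import hashes

def encrypt_for_server(data: bytes, server_public_key) -> tuple[bytes, bytes, bytes]:
    """Encrypt `data` under a fresh symmetric key, then wrap that key for the server."""
    session_key = AESGCM.generate_key(bit_length=256)  # fresh key per message
    nonce = os.urandom(12)
    ciphertext = AESGCM(session_key).encrypt(nonce, data, None)
    # Only the server's private key can unwrap the session key.
    wrapped_key = server_public_key.encrypt(
        session_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return wrapped_key, nonce, ciphertext

def decrypt_on_server(wrapped_key: bytes, nonce: bytes, ciphertext: bytes,
                      server_private_key) -> bytes:
    """Unwrap the session key, then decrypt the payload with it."""
    session_key = server_private_key.decrypt(
        wrapped_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return AESGCM(session_key).decrypt(nonce, ciphertext, None)
```

Because each payload uses its own session key, compromising one key exposes only that payload, which is the property jf87 points out.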

**@jf87:** Also, I would add an overview figure which displays the main components and their interactions. Then it's easier to follow the steps you describe.

> 1. The user collects data from the application. The application uses a phone-specific private key to encrypt the data and sends the encrypted data to the server.
> 2. The user finds an algorithm which they wish to run on their data, or an aggregating algorithm in which they are comfortable participating. The user then acquires the hash for this algorithm (possibly with a QR code) and updates their profile on the server to grant permissions to run the algorithm.
> 3. The user decides they want to run one of the algorithms they have approved. To do so they need to send their private key to the server so that it can decrypt their stored data. This is done by spawning a user enclave built through Graphene SGX running in a Docker container. The user then remotely attests this container, and once this establishes a secure channel between the user and the enclave, the user transmits the private key over that channel.
> 4. The server enclave uses the hash for the algorithm to determine a microservice to run on the server (or remotely). This then spawns a microservice enclave, which the server enclave will need to attest to develop a secure channel.
**@shankari:** and by "server enclave" here, you mean the "user enclave" that you talked about earlier, right?

**@shankari:** because you also call it "secure enclave" in the next bullet point

**@jf87:** It seems there is no need to send the key. My assumption is that we have SGX to store the keys to the data securely and to store permissions. If any application wants to access user data, it calls some API on the server; then we can go through SGX to verify whether the application should have access at all, or at which granularity it should have access. It is then the responsibility of the application to make use of the data that it gets.
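The access check jf87 describes, keys and permissions held inside the enclave and consulted before any data is released, might look roughly like this. All names, and the choice of a time range as the permission granularity, are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Permission:
    algorithm_hash: str  # identifies the approved algorithm
    start_ts: float      # earliest timestamp it may read
    end_ts: float        # latest timestamp it may read

@dataclass
class UserEnclaveState:
    data_key: bytes                                   # never leaves the enclave
    permissions: list = field(default_factory=list)   # list[Permission]

    def authorize(self, algorithm_hash: str, start_ts: float, end_ts: float) -> bool:
        """Allow access only if some stored permission covers the whole range."""
        return any(
            p.algorithm_hash == algorithm_hash
            and p.start_ts <= start_ts
            and end_ts <= p.end_ts
            for p in self.permissions
        )
```

The point of the sketch is that the key material stays in `UserEnclaveState` and only decrypted data (for an authorized range) would ever leave the enclave.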

**@jf87:** Also, instead of a hash, it might also be possible to use WAVE?


> #### Aggregate Algorithms
>
> It is also possible for a user to agree to be a participant in algorithms that aggregate over larger groups of data. This requires a few changes to the architecture and a different form of interaction. First, to facility these algorithms that are not requested by the user, it is necessary to have the server enclave available even when a user is offline. To do this we will keep the server enclave running with the private key and the user profile, and only shut down the enclave upon request from the user, or if it is necessary to update details about the profile or key in a manner which modifies existing behavior.
**@shankari:** to facility -> to facilitate?

**@shankari:** since this assumes SGX, can't we use sealing to suspend the server enclaves when they are not actively being used?

> 4. The server enclave uses the hash for the algorithm to determine a microservice to run on the server (or remotely). This then spawns a microservice enclave, which the server enclave will need to attest to develop a secure channel.
> 5. The server enclave sends the data to the microservice to use in conducting its algorithm. In doing so, the server enclave will decrypt the data inside the secure enclave and then transmit it over the secure channel formed between the enclaves.
> 6. The microservice performs the algorithm and returns the output of the algorithm to the server enclave.
> 7. The server enclave then returns the result of running the algorithm to the user.
**@shankari:** I think that there should also be the option for the server enclave to store the results back to the encrypted datastore. The current algorithms do this (e.g. store the results of running the pipeline under different keys).

**@jf87:** So does this mean that the algorithm will run outside SGX? How do we protect user data when it is processed by an algorithm?


> 1. The user's smartphone makes a request to a known access location (essentially a server at a known domain) with a request to spawn a user cloud instance.
> 2. The known access location spawns a container to produce a "user cloud." This user cloud consists of a server running inside a secure enclave via Graphene. The known access location then replies to the smartphone with the address and port of the spawned user cloud.
> 3. The smartphone connects to the known access location. The two establish a secure channel through SGX's remote attestation. All user clouds will run the same general program, so this component is trusted to only allow a new user to connect once at the beginning. While the known access location is untrusted, the user cloud's code will be open source and its hash known, allowing us to verify the connection. Then the smartphone will send its private key and profile of allowed algorithms to the user cloud.
**@shankari:** this should be "The smart phone connects to the spawned user cloud", right?

> 4. The user sends some data that it wishes to store to the user cloud over the established secure connection.
> 5. The user cloud spawns the user's database instance as a container and provides the instance with the private key. The instance can be any paricular database which runs on a section of a distributed file system reserved just for the user (so all contents can be encrypted with the user's private key).
**@shankari:** *particular

> 6. The user cloud sends the data to the database instance. This database instance will then store the data encrypted with the private key.
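The lookup-or-spawn behavior of the known access location in steps 1-2 (including the shortcut for an already-running user cloud) can be sketched as follows. All names are hypothetical, and the placeholder `spawn_user_cloud` stands in for actually launching a Graphene-SGX Docker container.

```python
# Registry of running user clouds: user_id -> (host, port).
running_clouds: dict = {}

def spawn_user_cloud(user_id: str) -> tuple:
    """Placeholder for launching the user-cloud container; returns its endpoint."""
    return ("user-cloud.example.org", 9000 + len(running_clouds))

def get_or_spawn(user_id: str) -> tuple:
    # If this user's cloud is already running, return its existing address
    # (the step-2 shortcut); otherwise spawn a new one and remember it.
    if user_id not in running_clouds:
        running_clouds[user_id] = spawn_user_cloud(user_id)
    return running_clouds[user_id]
```

Note that nothing here is trusted: per shankari's point below about attestation, the phone must still attest whatever endpoint this returns before sending anything sensitive.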

> Steps 1-3 constitute the process of launching a user cloud. If the user cloud is already running, then in step 2, rather than launching a new user cloud, the known access location should just return the address of the user's already-running user cloud (which it should be possible to authenticate, although we may want to produce some shared secret for existing user clouds).
**@shankari:** do we need to authenticate? Since the known access location is untrusted, it can return an arbitrary user cloud. The phone should attest the user cloud before it communicates with it, presumably through the cert mechanism.

> Step 5 launches a database instance. It will likely be necessary to keep the database running for much of the life of the user cloud. This step may instead consist of resuming the container, or can be skipped if it is already actively running.
>
> Below are diagrams showing a visual of the stages, numbered with the appropriate steps. Untrusted entities are in pink, while trusted components are light green.
**@shankari:** why does the user cloud need to see encrypted data (in "Architecture after a user cloud is spawned.")? If we are using the private key to decrypt the contents of the filesystem, the database and the user cloud can directly read encrypted data from it, right?


> ### Working with a Subset of Data
>
> Another challenge is how to give algorithms approval for only a subset of data. For example, imagine I wanted to give an algorithm access to all my travel data for only the previous month. The biggest challenge in this domain is managing the complexity it produces. Do we want to use a unique key for each permission category? What if a subset of data is approved for some algorithms but not others? What happens if a key is lost or needs to be changed? Ultimately we hope many of these issues can be avoided by our implicit trust in the data-fetching server enclave, but we do have concerns about inflating its size and complexity given the important data it manages.
**@shankari:** I think that this is essentially the problem that WAVE (David's other student, M. Andersen) is trying to solve. Once we get there, we should definitely explore WAVE.

**@jf87:** Ok, I also commented regarding WAVE above ;-)
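One possible shape for the "unique key for each permission category" option raised above: derive per-category subkeys from a single master secret with HKDF. A lost subkey can then be re-derived from the master, and one category can be shared by handing out only its subkey. This is a sketch under that assumption; the category labels are hypothetical.

```python
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

def category_key(master_key: bytes, category: str) -> bytes:
    """Deterministically derive a 256-bit subkey for one permission category."""
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=category.encode(),  # e.g. b"travel/2019-02"
    ).derive(master_key)
```

The trade-off, which echoes the concern in the text, is that the master key becomes the single point of compromise, so it would still need to live inside the trusted enclave.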

**@jf87** (Contributor) left a comment:

I just went through all. See my comments.


> 2. The user finds an algorithm which they wish to run on their data, or an aggregating algorithm in which they are comfortable participating. The user then acquires the hash for this algorithm (possibly with a QR code) and updates their profile on the server to grant permissions to run the algorithm.

**@jf87:** Maybe each data segment that is sent to the server can be associated with some permissions. By default the user's smartphone has full access. Then when we allow access to other applications (algorithms), we can amend these permissions. This would allow us to have permissions at a smaller granularity (e.g., one trip or one day of data) and not just either full access or none.
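jf87's per-segment permissions might be modeled as an access list attached to each stored segment, one trip or one day, with the owner's phone granted access by default. A minimal sketch; all field names and the `"owner-smartphone"` principal are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DataSegment:
    segment_id: str   # e.g. "trip/2019-03-06T08:15"
    ciphertext: bytes
    # Principals allowed to read this segment; the owner's phone by default.
    readers: set = field(default_factory=lambda: {"owner-smartphone"})

    def grant(self, algorithm_hash: str) -> None:
        """Amend this segment's permissions to admit one more algorithm."""
        self.readers.add(algorithm_hash)

    def may_read(self, principal: str) -> bool:
        return principal in self.readers
```

Granting access segment by segment is what makes "one trip or one day" granularity possible, instead of the all-or-nothing profile-level grant in step 2.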

> Since aggregation also occurs independently of user requests, it is no longer feasible to have the server enclave launch a microservice. Instead the group intending to perform aggregation will launch an aggregator enclave, which will launch a new enclave per user that produces a scalar value based upon the user's data. That scalar enclave will communicate directly with the server enclave to get the data, and will need to be stored in the profile. Then this scalar can be directly communicated to the aggregator enclave to compute the aggregate result over the data.

**@jf87:** Why can we not calculate this scalar in the user enclave itself? In my mind, user enclaves can provide the same interface to user-initiated algorithms and aggregate algorithms initiated by third parties. In both cases there is some request to the user enclave for some data range and with some algorithm ID. Then the permissions stored in the user enclave will ensure a correct response. Does this make sense?
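The uniform interface jf87 proposes could be sketched as a single user-enclave entry point: both user-initiated and third-party aggregate requests supply an algorithm ID and a data range, the stored permissions are checked, and the window is reduced to a scalar inside the enclave. Everything here is illustrative.

```python
def handle_request(permissions: dict, data: list, algorithm_id: str,
                   start_ts: float, end_ts: float) -> float:
    """permissions maps algorithm_id -> (start_ts, end_ts) it may read.
    data is a list of records like {"ts": ..., "value": ...}."""
    allowed = permissions.get(algorithm_id)
    if allowed is None or start_ts < allowed[0] or end_ts > allowed[1]:
        raise PermissionError(f"{algorithm_id} not approved for this range")
    # Select only the requested window of the user's data.
    window = [d for d in data if start_ts <= d["ts"] <= end_ts]
    # E.g. an aggregate algorithm reduces the window to a single scalar,
    # which is all that ever leaves the enclave.
    return sum(d["value"] for d in window)
```

The design point is that only the scalar crosses the enclave boundary, so the aggregator never sees raw user data regardless of who initiated the request.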


**@shankari** (Contributor): Merging this for now to make it easier to read.
