what we know
WIP. Sorry about any grammar mistakes.
We want the following magic box:

```
Consumer                             Host
--------                             ----
                     +---------+
{Consumer State} ->  |         | <-  {Content Resource State (for a resource)}
                     | *MAGIC* | <-  {time()}
                     |         | ->  access?
                     +---------+
```
- The consumer should only have to refresh its state (over the network) once per day, or maybe once per hour.
- The content host should not initiate any network connections.
We would like to avoid having content hosts run special DRACL servers. Instead, we would like to provide two functions: `make_challenge(acl)` and `verify_response(acl, challenge, response)`. Therefore, we establish the following rules:
- The content host should be stateless with respect to authentication. The consumer should be responsible for storing any intermediate state.
- The system should expose a limited API to the content host. Define one function, `dracl_step: input -> Allow|Deny|Error|Continue(output)`. If `dracl_step` outputs `Continue`, the output should be sent to the client and the protocol run again. Basically, this is a black box state machine.
- The protocol should return false if any party inputs partial state.
- A read-only compromise of the content host should not allow an attacker to authenticate as a consumer for any resource.
- Preferably, the authentication provider would store no (or little) secret state on internet facing servers.
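The `dracl_step` state machine above can be sketched as a driver loop. This is a hypothetical illustration, not a real API: the result variants, `run_protocol`, and the `exchange` callback are all made-up names standing in for whatever the content host's integration would look like.

```python
# Hypothetical sketch of the dracl_step black-box state machine.
# All names (Allow, Deny, Error, Continue, run_protocol) are illustrative.
from dataclasses import dataclass

@dataclass
class Allow: pass

@dataclass
class Deny: pass

@dataclass
class Error:
    reason: str

@dataclass
class Continue:
    output: bytes  # sent back to the consumer, who produces the next input

def run_protocol(dracl_step, first_input: bytes, exchange) -> bool:
    """Drive dracl_step until it terminates.

    `exchange(output) -> input` models one network round trip with the
    consumer; the content host itself keeps no state between steps.
    """
    msg = first_input
    while True:
        result = dracl_step(msg)
        if isinstance(result, Allow):
            return True
        if isinstance(result, (Deny, Error)):
            return False
        msg = exchange(result.output)  # Continue: one more round trip

# Demo: a step function that asks for one round trip, then allows.
_steps = iter([Continue(b"challenge"), Allow()])
granted = run_protocol(lambda m: next(_steps), b"hello", lambda out: b"response")
```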
Users know the set of things they can access. Obviously.
Unfortunately, users may also learn about content they cannot access. This information leakage can be limited by not telling users about content that they can't access (or by mixing in content that no user can access).
Basically, there are a couple of ways in which consumers can learn about the producer's groups that we can't fix.
If multiple users collude, they can always determine which content is accessible by any subset of the colluding users. If some subset is able to access a set of content and another subset is not, the users can conclude that the first subset is probably in a group to which the second subset does not belong.
Over time, users will be added to and removed from groups. They can use this to map out which content they gain and lose access to as follows:

```
time  r1  r2  r3  ...
---------------------
 1    y   n   n
 2    y   y   y
 ...
```
If a consumer Alice learns, through some side channel (e.g. the content itself), that some other consumer Bob can access r2, Alice can deduce that Bob can likely also access r3.
This is a direct consequence of learning access patterns and changing access over time.
Content hosts can fingerprint users based on the content they can access. This can be mitigated by asking a user if he or she wishes to authenticate.
If a consumer can authenticate with part of its state, it can categorize resources by which pieces of its state can be used to access each resource and which pieces of state can't. If it knows anything about which other users can access each resource (for example, messages often address the readers by name), it can categorize users by the resources they can access. By transitivity, it can categorize users by the state it needed to authenticate.
If a content host can verify a user using partial information, it can fingerprint the user by learning which piece of information it needed to use to authenticate the user.
If a consumer can verify against a partial ACL, the consumer can collude with another consumer to learn about group composition.
If a content host stores secret state, this state will have to be replaced on break in. Furthermore, if this state can be used to authenticate and access resources, an information disclosure exploit becomes a read-write exploit.
In practice, this means that all authentication should use asymmetric crypto.
As a general rule, caching/reusing state tends to leak information. However, I can't quite formalize this observation.
The basic idea is to encrypt a resource key with every user's secret key.
Give every consumer a secret `u_i`. Generate the ACL as follows:

```
s = GenSalt()
k = GenKey()
H = {Hash(s, u_i) ⊕ k ∀ u_i ∈ ACL}
v = Hash(k) // Used to check k
C = (s, H, v)
ACL = (C, σ(C))
```
To prove membership in this ACL, the consumer:

- Verifies the signature.
- Computes `hu_i = Hash(s, u_i)`.
- For all `h_i ∈ H`, computes `k_i = h_i ⊕ hu_i` and finds the `k_i` where `Hash(k_i) == v`.
- Returns `k_i` (or proves knowledge thereof).
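The hash-and-XOR construction above is simple enough to sketch directly. This toy version uses SHA-256 for `Hash`; the producer's signature `σ(C)` and the zero-knowledge variant of the final step are omitted.

```python
# Toy sketch of the hash-based ACL scheme above (signatures omitted).
import hashlib
import os

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_acl(consumer_secrets):
    """consumer_secrets: the per-consumer secrets u_i in the ACL."""
    s = os.urandom(32)                           # s = GenSalt()
    k = os.urandom(32)                           # k = GenKey()
    H_set = [xor(H(s, u), k) for u in consumer_secrets]
    v = H(k)                                     # used to check k
    return (s, H_set, v)                         # C; σ(C) omitted here

def prove_membership(acl, u):
    """Recover k if u is in the ACL, else None (signature check omitted)."""
    s, H_set, v = acl
    hu = H(s, u)
    for h_i in H_set:
        k_i = xor(h_i, hu)
        if H(k_i) == v:
            return k_i
    return None

# Demo: two members recover the same k; an outsider recovers nothing.
acl = make_acl([b"alice-secret", b"bob-secret"])
k_alice = prove_membership(acl, b"alice-secret")
k_eve = prove_membership(acl, b"eve-secret")
```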
This first protocol provides consumer privacy in the face of a malicious content host.
Vinod: This is an improved version of the basic bilinear map protocol that we discussed that works in the face of a malicious content server.
Every day, each consumer gets...
...a public component, where `t` changes daily:

```
Pk_t = h^t, σ(h^t) // σ is a signature by the producer.
```

...a secret component for each group `γ_i` the consumer is a member of:

```
Sk_u = {g^(t*γ_i)}
```
For each resource, the content host gets (all public):

```
// For all groups γ_i that should be able to access the content.
Pk_r = {g^(r*γ_i)}
σ(Pk_r)
Sk_r = h^r
```
Consumer
First, the consumer sends today's public component (`Pk_t`) to the content host.
Content Host
The content host checks the signature on the public component and verifies that it is valid today (proving that the credentials are valid today).

The content host generates a random (or, more likely, derives) `s` and, for each entry in `Pk_r`, computes:

```
k_i = e(Pk_r, Pk_t^s) = e(g^(r*γ_i), h^(s*t)) = e(g, h)^(t*r*s*γ_i)
```

Next, the content host uses each `k_i` to symmetrically encrypt `s`, generating:

```
Chal = {SymmetricEncrypt(e(g, h)^(t*r*s*γ_i), s)}
```

Finally, the content host sends `Chal`, `Sk_r^s`, and `Pk_r` (including the signature) to the consumer.
Consumer
The consumer checks the signature on `Pk_r` and then computes `k_i`s for each element in `Sk_u`:

```
k_i = e(g^(t*γ_i), Sk_r^s) = e(g, h)^(t*r*s*γ_i)
```

Then, the consumer tries to decrypt each element in `Chal` with each `k_i` (we can probably make this linear). If the consumer succeeds, it uses the decrypted `s`, `Pk_t`, and the elements in `Pk_r` to generate the real `k_i`s:

```
k_i = e(Pk_t^s, g^(r*γ_i)) = e(g, h)^(t*s*r*γ_i)
```

It then uses these to decrypt the rest of the entries in `Chal` and verify that they contain `s`. This prevents the content host from lying to the consumer by providing a partial challenge.

Finally, the consumer sends `s` back to the content host to gain access to the content.
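The key-agreement core of this exchange can be sanity-checked without a pairing library. In the toy model below there is no real bilinear map: group elements are represented by their exponents modulo a prime `q`, and `e(g^a, h^b)` is modeled as the exponent `a*b` of `e(g, h)`. This only demonstrates that both sides derive matching keys; it proves nothing about security, and signatures plus the consumer's verification of the remaining challenge entries are omitted.

```python
# Toy simulation of the bilinear-map protocol above. No real pairing:
# group elements are modeled by their exponents mod a prime q.
import hashlib
import secrets as rnd

q = (1 << 127) - 1  # toy prime group order (a Mersenne prime)

def pair(a, b):
    """Model e(g^a, h^b) by the exponent a*b of e(g, h)."""
    return (a * b) % q

def kdf(x: int) -> bytes:
    return hashlib.sha256(str(x).encode()).digest()

def enc(key: int, m: int) -> bytes:
    ks = kdf(key)
    ct = bytes(a ^ b for a, b in zip(ks, m.to_bytes(32, "big")))
    tag = hashlib.sha256(ks + ct).digest()[:16]  # toy authentication tag
    return ct + tag

def dec(key: int, blob: bytes):
    ks = kdf(key)
    ct, tag = blob[:32], blob[32:]
    if hashlib.sha256(ks + ct).digest()[:16] != tag:
        return None  # wrong key: decryption fails, as a real AEAD would
    return int.from_bytes(bytes(a ^ b for a, b in zip(ks, ct)), "big")

# Producer setup: epoch secret t, resource secret r, group secrets γ_i.
t = rnd.randbelow(q)
r = rnd.randbelow(q)
groups = [rnd.randbelow(q) for _ in range(3)]
Sk_u = [(t * g) % q for g in groups[:2]]  # consumer is in γ_1 and γ_2
Pk_r = [(r * g) % q for g in groups]      # exponents of g^(r*γ_i)
Pk_t = t                                  # exponent of h^t
Sk_r = r                                  # exponent of h^r

# Content host: blind with s and encrypt s under each pairing key.
s = rnd.randbelow(q)
chal = [enc(pair(pk, (Pk_t * s) % q), s) for pk in Pk_r]
Sk_r_s = (Sk_r * s) % q                   # exponent of Sk_r^s

# Consumer: derive k_i = e(g,h)^(t*r*s*γ_i) and try each challenge entry.
recovered = None
for sk in Sk_u:
    k_i = pair(sk, Sk_r_s)
    for c in chal:
        m = dec(k_i, c)
        if m is not None:
            recovered = m  # verifying the remaining entries is omitted
```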
This protocol should prevent consumers from learning anything about the groups they are in unless they collude. Yes, I understand that this guarantee is a bit vague. I'm working on it.
Unfortunately, this protocol limits how many groups a consumer can be in per-producer.
First, define some maximum number of groups a consumer can be in (`n`). Then, pick a set of `n + 2` numbers (`p`). Finally, generate permutations of `g` and `h` as follows:

```
g_i = g^p_i       // for all 0 ≤ i ≤ n
h_i = h^(1/p_i)   // for all 0 ≤ i ≤ n
g_t = g^p_{n+1}
h_t = h^(1/p_{n+1})
```
This means that all `e(g_i, h_i) = e(g, h)`. This ensures that an attacker can't pair some `g_i` with the wrong `h_i` and get anything useful back.
For each consumer, every day, the producer computes the consumer secret:

```
// Example with 3 groups.
ax³ + bx² + cx + d = t*((x-γ_1)(x-γ_2)(x-γ_3) + 1)
Sk_u = [g_3^a, g_2^b, g_1^c, g_0^d]
```
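The polynomial trick above can be checked numerically. This sketch uses toy constants and plain modular arithmetic (no group elements): it expands `t*((x-γ_1)(x-γ_2)(x-γ_3) + 1)` into coefficients `a, b, c, d` and confirms that the polynomial evaluates to `t` exactly at the member groups, which is what makes the pairing product collapse later.

```python
# Check: t*((x-γ1)(x-γ2)(x-γ3) + 1) evaluates to t at each member γ_i.
q = 2**61 - 1        # toy prime modulus
t, g1, g2, g3 = 7, 11, 13, 17  # toy epoch secret and group identifiers

# (x-γ1)(x-γ2)(x-γ3) = x³ - e1·x² + e2·x - e3 (elementary symmetric sums)
e1 = g1 + g2 + g3
e2 = g1*g2 + g1*g3 + g2*g3
e3 = g1*g2*g3
a = t % q
b = (-t * e1) % q
c = (t * e2) % q
d = (t * (1 - e3)) % q

def poly(x):
    return (a*x**3 + b*x**2 + c*x + d) % q

# Members recover t (the inner product vanishes, leaving t*1);
# a non-member value does not.
members_ok = all(poly(g) == t for g in (g1, g2, g3))
outsider_ok = poly(19) != t
```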
The day's public component (`Pk_t`) is:

```
Pk_t = h_t^t, σ(h_t^t) // σ is a signature by the producer.
```

For the content host, for each resource, the producer computes two public components:

```
// For each group γ_i.
// n is the maximum number of groups a consumer can be in per producer.
Pk_rh = {[h_n^(r*γ_iⁿ), ..., h_3^(r*γ_i³), h_2^(r*γ_i²), h_1^(r*γ_i), h_0^r], ...}
σ(Pk_r)
Pk_rg = g_t^r
```
Consumer
First, the consumer sends `Pk_t` to the content host.

Content Host

Then, the content host verifies the signature on `Pk_t`.

Next, the content host chooses some secret `s` and computes:

```
K = e(Pk_rg^s, Pk_t) = e(g, h)^(r*s*t)
Chal = SymmetricEncrypt(K, s)
Pk_rh^s = {[h_n^(r*s*γ_iⁿ), ..., h_3^(r*s*γ_i³), h_2^(r*s*γ_i²), h_1^(r*s*γ_i), h_0^(r*s)], ...}
```

The content host then sends `Chal`, `Pk_rh^s`, and `Pk_rg` to the consumer.
Consumer
For each element in `Pk_rh^s`, the consumer uses `Sk_u` to calculate:

```
K = e(h_3^(r*s*γ_i³), g_3^a) * e(h_2^(r*s*γ_i²), g_2^b) * e(h_1^(r*s*γ_i), g_1^c) * e(h_0^(r*s), g_0^d)
  = e(h, g)^(r*s*(a*γ_i³ + b*γ_i² + c*γ_i + d))
```

And try to decrypt `Chal`. If `γ_i` is one of the user's groups, `(γ_i-γ_1)(γ_i-γ_2)(γ_i-γ_3) + 1` evaluates to 1, so the polynomial evaluates to `t`, leaving `K = e(h, g)^(r*s*t)`.

Finally, the consumer returns `s` to the content host.
We should probably include a randomly generated "group" in each consumer secret but this is mostly gut feeling.
So, I'm a bit concerned about the security of this protocol. With collusion, users will likely be able to learn all `g^(t*γ_i)` and `g^t`, but I don't think this breaks security.
This system has the following privacy deficiencies in addition to the unavoidable ones.
In this protocol, both authorized consumers (consumers who can access the content) and the content host must learn the identity (really, a pseudonym) of the publisher.
The challenge contains a signature that's only replaced when the ACL itself is replaced so users can learn when the ACL has been replaced. Note: users can't tell if the new ACL differs from the old one.
Learning that one has likely been added/removed from a group is impossible to prevent but this protocol allows users to prove that they have been added/removed from at least one group. They can do this because they can prove that they have lost access to a resource which hasn't had its ACL replaced.
If multiple users collude, they can definitively learn whether or not they are in the same group because challenges are not atomic. If two users can decrypt the same part of a challenge, they can conclude that they are in the same group.
Consumers will retain access to content in groups from which they have been removed for a day. Note: we can shrink this window a bit which I'll explain later...
This is a very simple authentication scheme that gives us everything we want in terms of privacy/security and simplicity but is terribly inefficient in terms of network communication.
https://idemixdemo.mybluemix.net/
To create an ACL, a publisher creates a list of uids/gids that can access the content and encrypts it with a symmetric secret key.
To access a resource:
- The content host sends the ACL to the consumer.
- The consumer authenticates with the publisher's authentication provider and uploads the ACL.
- The authentication provider decrypts the ACL, verifies that the consumer should be able to access the content, then signs the ACL certifying that this user should be able to access the content.
- The consumer gives this certificate to the content host.

Downsides:

- The content host needs to be in the loop (hot path) for every transaction.
- A break-in at the authentication provider's server compromises the entire system.
- The authentication provider learns when any user tries to access any content.
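The always-online flow above can be sketched with stand-ins for the crypto. The standard library has no asymmetric signatures, so HMAC keys stand in for both the provider's signing keypair and the publisher/provider shared ACL key; all function names are illustrative, and the real scheme would use an AEAD and public-key signatures.

```python
# Toy sketch of the always-online scheme above (HMAC stands in for both
# the symmetric ACL encryption and the provider's signature).
import hashlib
import hmac
import json

provider_acl_key = b"publisher<->provider shared key"   # encrypts ACLs
provider_sign_key = b"provider signing key (stand-in)"  # signs approvals

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: SHA-256 in counter mode. Illustrative only."""
    out = bytearray()
    ctr = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, out))

def make_acl(uids):
    """Publisher: encrypt the uid list so only the provider can read it."""
    ct = xor_stream(provider_acl_key, json.dumps(sorted(uids)).encode())
    mac = hmac.new(provider_acl_key, ct, hashlib.sha256).digest()
    return ct + mac  # opaque to content host and consumer

def provider_authorize(acl, uid):
    """Provider: decrypt the ACL, check membership, certify the user."""
    ct, mac = acl[:-32], acl[-32:]
    if not hmac.compare_digest(
        mac, hmac.new(provider_acl_key, ct, hashlib.sha256).digest()
    ):
        return None
    if uid not in json.loads(xor_stream(provider_acl_key, ct)):
        return None
    return hmac.new(provider_sign_key, acl + uid.encode(), hashlib.sha256).digest()

def host_verify(acl, uid, cert):
    """Content host: check the provider's certificate for this ACL + user."""
    expected = hmac.new(provider_sign_key, acl + uid.encode(), hashlib.sha256).digest()
    return cert is not None and hmac.compare_digest(cert, expected)

# Demo: alice is certified; eve is refused.
acl = make_acl(["alice", "bob"])
cert = provider_authorize(acl, "alice")
```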
Basically, take the always-online scheme and precompute the signatures. That is, every client gets a certificate/token for every resource it can access.

Downsides:

- If we use certificates, tons of up-front computation. If we use tokens, we have to modify ACLs to revoke access.
- Publishers need to remember every piece of content they have ever published and who has access to it.
- Consumers need to remember every piece of content they have access to.
Every group is identified by a unique secret key.
To create an ACL, generate a resource specific asymmetric keypair, encrypt the secret key with every group key for every group that should be able to access the resource, and sign this encrypted set. The ACL is the public key + the signed encrypted set.
To authenticate, a consumer downloads the encrypted set, checks the signature, and tries decrypting each entry. If the consumer succeeds in decrypting an entry, it proves it knows the asymmetric secret key (by, e.g., signing some document).
Downsides:

- Producers can't "take back" group keys so, to remove a consumer from a group, the producer must re-create that group with a new key. This means that, if the producer wants the removed user to lose access to existing resources, the producer must update all published ACLs that include that group.
- Users learn about the groups they are in.
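The group-key ACL construction above can be sketched with symmetric stand-ins. Here the resource's "asymmetric secret key" is just a random byte string, proving knowledge of it is modeled as an HMAC over a challenge, and the producer's signature over the encrypted set is omitted; `wrap`/`unwrap`/`authenticate` are illustrative names.

```python
# Toy sketch of the group-key scheme: the resource key is encrypted
# under each group key; decrypting any entry proves group membership.
import hashlib
import hmac
import os

def wrap(group_key: bytes, resource_key: bytes) -> bytes:
    """Encrypt a 32-byte resource key under a group key, with a tag."""
    ks = hashlib.sha256(group_key).digest()
    ct = bytes(a ^ b for a, b in zip(ks, resource_key))
    tag = hmac.new(group_key, ct, hashlib.sha256).digest()[:16]
    return ct + tag

def unwrap(group_key: bytes, blob: bytes):
    """Recover the resource key, or None if this group key is wrong."""
    ct, tag = blob[:32], blob[32:]
    good = hmac.new(group_key, ct, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, good):
        return None
    ks = hashlib.sha256(group_key).digest()
    return bytes(a ^ b for a, b in zip(ks, ct))

def make_acl(resource_key: bytes, group_keys):
    """One entry per group that may access the resource (σ omitted)."""
    return [wrap(gk, resource_key) for gk in group_keys]

def authenticate(acl, my_group_keys, challenge: bytes):
    """Try every (group key, entry) pair; prove knowledge via HMAC."""
    for gk in my_group_keys:
        for entry in acl:
            rk = unwrap(gk, entry)
            if rk is not None:
                return hmac.new(rk, challenge, hashlib.sha256).digest()
    return None

# Demo: a member of "friends" authenticates; a stranger cannot.
rk = os.urandom(32)
acl = make_acl(rk, [b"friends-key", b"family-key"])
proof = authenticate(acl, [b"friends-key"], b"nonce")
```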