what we know
WIP. Sorry about any grammar mistakes.
We want the following magic box:

```
Consumer                             Host
--------                             ----
                     +---------+
{Consumer State} ->  |         | <-  {Content Resource State (for a resource)}
                     | *MAGIC* | <-  {time()}
                     |         | ->  access?
                     +---------+
```
- The consumer should only have to refresh its state (over the network) once per day, or maybe once per hour.
- The content host should not initiate any network connections.
We would like to avoid having content hosts run special DRACL servers. Instead, we would like to provide two functions: `make_challenge(acl)` and `verify_response(acl, challenge, response)`. Therefore, we establish the following rules:
- The content host should be stateless with respect to authentication. The consumer should be responsible for storing any intermediate state.
- The system should expose a limited API to the content host. Define one function, `dracl_step: input -> Allow|Deny|Error|Continue(output)`. If `dracl_step` outputs `Continue`, the output should be sent to the client and the protocol run again. Basically, this is a black box state machine.
- The protocol should return false if any party inputs partial state.
- A read-only compromise of the content host should not allow an attacker to authenticate as a consumer for any resource.
- Preferably, the authentication provider would store no (or little) secret state on internet facing servers.
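The `dracl_step` state machine above can be sketched as a driver loop. This is a hypothetical illustration, not a real API: the result variants, `run_protocol`, and the `exchange` callback are all made-up names standing in for whatever the content host's integration would look like.

```python
# Hypothetical sketch of the dracl_step black-box state machine.
# All names (Allow, Deny, Error, Continue, run_protocol) are illustrative.
from dataclasses import dataclass

@dataclass
class Allow: pass

@dataclass
class Deny: pass

@dataclass
class Error:
    reason: str

@dataclass
class Continue:
    output: bytes  # sent back to the consumer, who produces the next input

def run_protocol(dracl_step, first_input: bytes, exchange) -> bool:
    """Drive dracl_step until it terminates.

    `exchange(output) -> input` models one network round trip with the
    consumer; the content host itself keeps no state between steps.
    """
    msg = first_input
    while True:
        result = dracl_step(msg)
        if isinstance(result, Allow):
            return True
        if isinstance(result, (Deny, Error)):
            return False
        msg = exchange(result.output)  # Continue: one more round trip

# Demo: a step function that asks for one round trip, then allows.
_steps = iter([Continue(b"challenge"), Allow()])
granted = run_protocol(lambda m: next(_steps), b"hello", lambda out: b"response")
```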
Users know the set of things they can access. Obviously.
Unfortunately, users may also learn about content they cannot access. This information leakage can be limited by not telling users about content that they can't access (or by mixing in content that no user can access).
Basically, there are a couple of ways in which consumers can learn about the producer's groups that we can't fix.
If multiple users collude, they can always determine which content is accessible by any subset of the colluding users. If some subset is able to access a set of content and another subset is not, the users can conclude that the first subset is probably in a group to which the second subset does not belong.
Over time, users will be added to and removed from groups. They can use this to map out which content they gain and lose access to as follows:

```
time  r1  r2  r3  ...
---------------------
 1    y   n   n
 2    y   y   y
 ...
```
If a consumer Alice learns, through some side channel (e.g. the content itself), that some other consumer Bob can access r2, Alice can deduce that Bob can likely also access r3.
This is a direct consequence of learning access patterns and changing access over time.
Content hosts can fingerprint users based on the content they can access. This can be mitigated by asking a user if he or she wishes to authenticate.
If a consumer can authenticate with part of its state, it can categorize resources by which pieces of its state can be used to access each resource and which pieces of state can't. If it knows anything about which other users can access each resource (for example, messages often address the readers by name), it can categorize users by the resources they can access. By transitivity, it can categorize users by the state it needed to authenticate.
If a content host can verify a user using partial information, it can fingerprint the user by learning which piece of information it needed to use to authenticate the user.
If a consumer can verify against a partial ACL, the consumer can collude with another consumer to learn about group composition.
If a content host stores secret state, this state will have to be replaced on break in. Furthermore, if this state can be used to authenticate and access resources, an information disclosure exploit becomes a read-write exploit.
In practice, this means that all authentication should use asymmetric crypto.
As a general rule, caching/reusing state tends to leak information. However, I can't quite formalize this observation.
The basic idea is to encrypt a resource key with every user's secret key.
Give every consumer a secret `u_i`. Generate the ACL as follows:

```
s = GenSalt()
k = GenKey()
H = {Hash(s, u_i) ⊕ k ∀ u_i ∈ ACL}
v = Hash(k) // Used to check k
C = (s, H, v)
ACL = (C, σ(C))
```
To prove membership in this ACL, the consumer:

- Verifies the signature.
- Computes `hu_i = Hash(s, u_i)`.
- For all `h_i ∈ H`, computes `k_i = h_i ⊕ hu_i` and finds the `k_i` where `Hash(k_i) == v`.
- Returns `k_i` (or proves knowledge thereof).
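The hash-and-XOR construction above is simple enough to sketch directly. This toy version uses SHA-256 for `Hash`; the producer's signature `σ(C)` and the zero-knowledge variant of the final step are omitted.

```python
# Toy sketch of the hash-based ACL scheme above (signatures omitted).
import hashlib
import os

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_acl(consumer_secrets):
    """consumer_secrets: the per-consumer secrets u_i in the ACL."""
    s = os.urandom(32)                           # s = GenSalt()
    k = os.urandom(32)                           # k = GenKey()
    H_set = [xor(H(s, u), k) for u in consumer_secrets]
    v = H(k)                                     # used to check k
    return (s, H_set, v)                         # C; σ(C) omitted here

def prove_membership(acl, u):
    """Recover k if u is in the ACL, else None (signature check omitted)."""
    s, H_set, v = acl
    hu = H(s, u)
    for h_i in H_set:
        k_i = xor(h_i, hu)
        if H(k_i) == v:
            return k_i
    return None

# Demo: two members recover the same k; an outsider recovers nothing.
acl = make_acl([b"alice-secret", b"bob-secret"])
k_alice = prove_membership(acl, b"alice-secret")
k_eve = prove_membership(acl, b"eve-secret")
```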
This first protocol provides consumer privacy in the face of a malicious content host.
Vinod: This is an improved version of the basic bilinear map protocol that we discussed that works in the face of a malicious content server.
Every day, each consumer gets...
...a public component, where `t` changes daily:

```
Pk_t = h^t, σ(h^t) // σ is a signature by the producer.
```

...a secret component for each group `γ_i` the consumer is a member of:

```
Sk_u = {g^(t*γ_i)}
```
For each resource, the content host gets (all public):

```
// For all groups γ_i that should be able to access the content.
Pk_r = {g^(r*γ_i)}
σ(Pk_r)
Sk_r = h^r
```
Consumer
First, the consumer sends today's public component (`Pk_t`) to the content host.
Content Host
The content host checks the signature on the public component and verifies that it is valid today (proving that the credentials are valid today).

The content host generates a random (or, more likely, derives) `s` and, for each entry in `Pk_r`, computes:

```
k_i = e(Pk_r, Pk_t^s) = e(g^(r*γ_i), h^(s*t)) = e(g, h)^(t*r*s*γ_i)
```

Next, the content host uses each `k_i` to symmetrically encrypt `s`, generating:

```
Chal = {SymmetricEncrypt(e(g, h)^(t*r*s*γ_i), s)}
```

Finally, the content host sends `Chal`, `Sk_r^s`, and `Pk_r` (including the signature) to the consumer.
Consumer
The consumer checks the signature on `Pk_r` and then computes `k_i`s for each element in `Sk_u`:

```
k_i = e(g^(t*γ_i), Sk_r^s) = e(g, h)^(t*r*s*γ_i)
```

Then, the consumer tries to decrypt each element in `Chal` with each `k_i` (we can probably make this linear). If the consumer succeeds, it uses the decrypted `s`, `Pk_t`, and the elements in `Pk_r` to generate the real `k_i`s:

```
k_i = e(Pk_t^s, g^(r*γ_i)) = e(g, h)^(t*s*r*γ_i)
```

It then uses these to decrypt the rest of the entries in `Chal` and verify that they contain `s`. This prevents the content host from lying to the consumer by providing a partial challenge.

Finally, the consumer sends `s` back to the content host to gain access to the content.
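The key-agreement core of this exchange can be sanity-checked without a pairing library. In the toy model below there is no real bilinear map: group elements are represented by their exponents modulo a prime `q`, and `e(g^a, h^b)` is modeled as the exponent `a*b` of `e(g, h)`. This only demonstrates that both sides derive matching keys; it proves nothing about security, and signatures plus the consumer's verification of the remaining challenge entries are omitted.

```python
# Toy simulation of the bilinear-map protocol above. No real pairing:
# group elements are modeled by their exponents mod a prime q.
import hashlib
import secrets as rnd

q = (1 << 127) - 1  # toy prime group order (a Mersenne prime)

def pair(a, b):
    """Model e(g^a, h^b) by the exponent a*b of e(g, h)."""
    return (a * b) % q

def kdf(x: int) -> bytes:
    return hashlib.sha256(str(x).encode()).digest()

def enc(key: int, m: int) -> bytes:
    ks = kdf(key)
    ct = bytes(a ^ b for a, b in zip(ks, m.to_bytes(32, "big")))
    tag = hashlib.sha256(ks + ct).digest()[:16]  # toy authentication tag
    return ct + tag

def dec(key: int, blob: bytes):
    ks = kdf(key)
    ct, tag = blob[:32], blob[32:]
    if hashlib.sha256(ks + ct).digest()[:16] != tag:
        return None  # wrong key: decryption fails, as a real AEAD would
    return int.from_bytes(bytes(a ^ b for a, b in zip(ks, ct)), "big")

# Producer setup: epoch secret t, resource secret r, group secrets γ_i.
t = rnd.randbelow(q)
r = rnd.randbelow(q)
groups = [rnd.randbelow(q) for _ in range(3)]
Sk_u = [(t * g) % q for g in groups[:2]]  # consumer is in γ_1 and γ_2
Pk_r = [(r * g) % q for g in groups]      # exponents of g^(r*γ_i)
Pk_t = t                                  # exponent of h^t
Sk_r = r                                  # exponent of h^r

# Content host: blind with s and encrypt s under each pairing key.
s = rnd.randbelow(q)
chal = [enc(pair(pk, (Pk_t * s) % q), s) for pk in Pk_r]
Sk_r_s = (Sk_r * s) % q                   # exponent of Sk_r^s

# Consumer: derive k_i = e(g,h)^(t*r*s*γ_i) and try each challenge entry.
recovered = None
for sk in Sk_u:
    k_i = pair(sk, Sk_r_s)
    for c in chal:
        m = dec(k_i, c)
        if m is not None:
            recovered = m  # verifying the remaining entries is omitted
```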
This protocol should prevent consumers from learning anything about the groups they are in unless they collude. Yes, I understand that this guarantee is a bit vague. I'm working on it.
Unfortunately, this protocol limits how many groups a consumer can be in per-producer.
First, define some maximum number of groups a consumer can be in (`n`). Then, pick a set of `n + 2` numbers (`p`). Finally, generate permutations of `g` and `h` as follows:

```
g_i = g^p_i       // for all 0 ≤ i ≤ n
h_i = h^(1/p_i)   // for all 0 ≤ i ≤ n
g_t = g^p_{n+1}
h_t = h^(1/p_{n+1})
```
This means that all `e(g_i, h_i) = e(g, h)`. This ensures that an attacker can't pair some `g_i` with the wrong `h_i` and get anything useful back.
For each consumer, every day, the producer computes the consumer secret:

```
// Example with 3 groups.
ax³ + bx² + cx + d = t*((x-γ_1)(x-γ_2)(x-γ_3) + 1)
Sk_u = [g_3^a, g_2^b, g_1^c, g_0^d]
```
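The polynomial trick above can be checked numerically. This sketch uses toy constants and plain modular arithmetic (no group elements): it expands `t*((x-γ_1)(x-γ_2)(x-γ_3) + 1)` into coefficients `a, b, c, d` and confirms that the polynomial evaluates to `t` exactly at the member groups, which is what makes the pairing product collapse later.

```python
# Check: t*((x-γ1)(x-γ2)(x-γ3) + 1) evaluates to t at each member γ_i.
q = 2**61 - 1        # toy prime modulus
t, g1, g2, g3 = 7, 11, 13, 17  # toy epoch secret and group identifiers

# (x-γ1)(x-γ2)(x-γ3) = x³ - e1·x² + e2·x - e3 (elementary symmetric sums)
e1 = g1 + g2 + g3
e2 = g1*g2 + g1*g3 + g2*g3
e3 = g1*g2*g3
a = t % q
b = (-t * e1) % q
c = (t * e2) % q
d = (t * (1 - e3)) % q

def poly(x):
    return (a*x**3 + b*x**2 + c*x + d) % q

# Members recover t (the inner product vanishes, leaving t*1);
# a non-member value does not.
members_ok = all(poly(g) == t for g in (g1, g2, g3))
outsider_ok = poly(19) != t
```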
The day's public component (`Pk_t`) is:

```
Pk_t = h_t^t, σ(h_t^t) // σ is a signature by the producer.
```

For the content host, for each resource, the producer computes two public components:

```
// For each group γ_i.
// n is the maximum number of groups a consumer can be in per producer.
Pk_rh = {[h_n^(r*γ_iⁿ), ..., h_3^(r*γ_i³), h_2^(r*γ_i²), h_1^(r*γ_i), h_0^r], ...}
σ(Pk_r)
Pk_rg = g_t^r
```
Consumer
First, the consumer sends `Pk_t` to the content host.

Content Host

Then, the content host verifies the signature on `Pk_t`.

Next, the content host chooses some secret `s` and computes:

```
K = e(Pk_rg^s, Pk_t) = e(g, h)^(r*s*t)
Chal = SymmetricEncrypt(K, s)
Pk_rh^s = {[h_n^(r*s*γ_iⁿ), ..., h_3^(r*s*γ_i³), h_2^(r*s*γ_i²), h_1^(r*s*γ_i), h_0^(r*s)], ...}
```

The content host then sends `Chal`, `Pk_rh^s`, and `Pk_rg` to the consumer.
Consumer
For each element in `Pk_rh^s`, the consumer uses `Sk_u` to calculate:

```
K = e(h_3^(r*s*γ_i³), g_3^a) * e(h_2^(r*s*γ_i²), g_2^b) * e(h_1^(r*s*γ_i), g_1^c) * e(h_0^(r*s), g_0^d)
  = e(h, g)^(r*s*(a*γ_i³ + b*γ_i² + c*γ_i + d))
```

And try to decrypt `Chal`. If `γ_i` is one of the user's groups, `(γ_i-γ_1)(γ_i-γ_2)(γ_i-γ_3) + 1` evaluates to 1, so the polynomial evaluates to `t`, leaving `K = e(h, g)^(r*s*t)`.

Finally, the consumer returns `s` to the content host.
We should probably include a randomly generated "group" in each consumer secret but this is mostly gut feeling.
So, I'm a bit concerned about the security of this protocol. With collusion, users will likely be able to learn all `g^(t*γ_i)` and `g^t`, but I don't think this breaks security.
This system has the following privacy deficiencies in addition to the unavoidable ones.
In this protocol, both authorized consumers (consumers who can access the content) and the content host must learn the identity (really, a pseudonym) of the publisher.
The challenge contains a signature that's only replaced when the ACL itself is replaced so users can learn when the ACL has been replaced. Note: users can't tell if the new ACL differs from the old one.
Learning that one has likely been added/removed from a group is impossible to prevent but this protocol allows users to prove that they have been added/removed from at least one group. They can do this because they can prove that they have lost access to a resource which hasn't had its ACL replaced.
If multiple users collude, they can definitively learn whether or not they are in the same group because challenges are not atomic. If two users can decrypt the same part of a challenge, they can conclude that they are in the same group.
Consumers will retain access to content in groups from which they have been removed for a day. Note: we can shrink this window a bit which I'll explain later...
This is a very simple authentication scheme that gives us everything we want in terms of privacy/security and simplicity but is terribly inefficient in terms of network communication.
https://idemixdemo.mybluemix.net/
To create an ACL, a publisher creates a list of uids/gids that can access the content and encrypts it with a symmetric secret key.
To access a resource:
- The content host sends the ACL to the consumer.
- The consumer authenticates with the publisher's authentication provider and uploads the ACL.
- The authentication provider decrypts the ACL, verifies that the consumer should be able to access the content, then signs the ACL certifying that this user should be able to access the content.
- The consumer gives this certificate to the content host.

Downsides:

- The content host needs to be in the loop (hot path) for every transaction.
- A break-in at the authentication provider's server compromises the entire system.
- The authentication provider learns when any user tries to access any content.
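The always-online flow above can be sketched with stand-ins for the crypto. The standard library has no asymmetric signatures, so HMAC keys stand in for both the provider's signing keypair and the publisher/provider shared ACL key; all function names are illustrative, and the real scheme would use an AEAD and public-key signatures.

```python
# Toy sketch of the always-online scheme above (HMAC stands in for both
# the symmetric ACL encryption and the provider's signature).
import hashlib
import hmac
import json

provider_acl_key = b"publisher<->provider shared key"   # encrypts ACLs
provider_sign_key = b"provider signing key (stand-in)"  # signs approvals

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: SHA-256 in counter mode. Illustrative only."""
    out = bytearray()
    ctr = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, out))

def make_acl(uids):
    """Publisher: encrypt the uid list so only the provider can read it."""
    ct = xor_stream(provider_acl_key, json.dumps(sorted(uids)).encode())
    mac = hmac.new(provider_acl_key, ct, hashlib.sha256).digest()
    return ct + mac  # opaque to content host and consumer

def provider_authorize(acl, uid):
    """Provider: decrypt the ACL, check membership, certify the user."""
    ct, mac = acl[:-32], acl[-32:]
    if not hmac.compare_digest(
        mac, hmac.new(provider_acl_key, ct, hashlib.sha256).digest()
    ):
        return None
    if uid not in json.loads(xor_stream(provider_acl_key, ct)):
        return None
    return hmac.new(provider_sign_key, acl + uid.encode(), hashlib.sha256).digest()

def host_verify(acl, uid, cert):
    """Content host: check the provider's certificate for this ACL + user."""
    expected = hmac.new(provider_sign_key, acl + uid.encode(), hashlib.sha256).digest()
    return cert is not None and hmac.compare_digest(cert, expected)

# Demo: alice is certified; eve is refused.
acl = make_acl(["alice", "bob"])
cert = provider_authorize(acl, "alice")
```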
Basically, take the always-online scheme and precompute the signatures. That is, every client gets a certificate/token for every resource it can access.

Downsides:

- If we use certificates, tons of up-front computation. If we use tokens, we have to modify ACLs to revoke access.
- Publishers need to remember every piece of content they have ever published and who has access to it.
- Consumers need to remember every piece of content they have access to.
Every group is identified by a unique secret key.
To create an ACL, generate a resource specific asymmetric keypair, encrypt the secret key with every group key for every group that should be able to access the resource, and sign this encrypted set. The ACL is the public key + the signed encrypted set.
To authenticate, a consumer downloads the encrypted set, checks the signature, and tries decrypting each entry. If the consumer succeeds in decrypting an entry, it proves it knows the asymmetric secret key (by, e.g., signing some document).
Downsides:

- Producers can't "take back" group keys so, to remove a consumer from a group, the producer must re-create that group with a new key. This means that, if the producer wants the removed user to lose access to existing resources, the producer must update all published ACLs that include that group.
- Users learn about the groups they are in.
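The group-key ACL construction above can be sketched with symmetric stand-ins. Here the resource's "asymmetric secret key" is just a random byte string, proving knowledge of it is modeled as an HMAC over a challenge, and the producer's signature over the encrypted set is omitted; `wrap`/`unwrap`/`authenticate` are illustrative names.

```python
# Toy sketch of the group-key scheme: the resource key is encrypted
# under each group key; decrypting any entry proves group membership.
import hashlib
import hmac
import os

def wrap(group_key: bytes, resource_key: bytes) -> bytes:
    """Encrypt a 32-byte resource key under a group key, with a tag."""
    ks = hashlib.sha256(group_key).digest()
    ct = bytes(a ^ b for a, b in zip(ks, resource_key))
    tag = hmac.new(group_key, ct, hashlib.sha256).digest()[:16]
    return ct + tag

def unwrap(group_key: bytes, blob: bytes):
    """Recover the resource key, or None if this group key is wrong."""
    ct, tag = blob[:32], blob[32:]
    good = hmac.new(group_key, ct, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, good):
        return None
    ks = hashlib.sha256(group_key).digest()
    return bytes(a ^ b for a, b in zip(ks, ct))

def make_acl(resource_key: bytes, group_keys):
    """One entry per group that may access the resource (σ omitted)."""
    return [wrap(gk, resource_key) for gk in group_keys]

def authenticate(acl, my_group_keys, challenge: bytes):
    """Try every (group key, entry) pair; prove knowledge via HMAC."""
    for gk in my_group_keys:
        for entry in acl:
            rk = unwrap(gk, entry)
            if rk is not None:
                return hmac.new(rk, challenge, hashlib.sha256).digest()
    return None

# Demo: a member of "friends" authenticates; a stranger cannot.
rk = os.urandom(32)
acl = make_acl(rk, [b"friends-key", b"family-key"])
proof = authenticate(acl, [b"friends-key"], b"nonce")
```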