title | filename | chapternum |
---|---|---|
Software obfuscation | lec_21_obfuscation | 22 |
Let us stop and think of the notions we have seen in cryptography. We have seen that under reasonable computational assumptions (such as LWE) we can achieve the following:
-
CPA secure private key encryption and message authentication codes (which can be combined to get CCA security or authenticated encryption). This means that two parties that share a key can have a virtual secure channel between them. Given an encryption of a message sent from Alice to Bob, an adversary cannot learn any additional information beyond her prior knowledge. Moreover, she cannot modify this message by even a single bit. It's lucky we only discovered these results from the 1970s onwards: if the Germans had used such an encryption instead of ENIGMA in World War II, there's no telling how many more lives would have been lost.
-
Public key encryption and digital signatures that enable Alice and Bob to set up such a virtually secure channel without sharing a prior key. This enables our "information economy" and protects virtually every financial transaction over the web. Moreover, it is the crucial mechanism for supplying "over the air" software updates to smart devices, whether they are phones, cars, thermostats or anything else. Some had predicted that this invention would change the nature of our form of government to crypto anarchy and, while this may be hyperbole, governments everywhere are worried about this invention.
-
Hash functions and pseudorandom functions enable us to create authentication tokens for deriving one-time passwords out of shared keys, or deriving long keys from short passwords. They are also useful as a tool in password based key exchange, which enables two parties to communicate securely (with fairly good but not overwhelming probability) when they share a 6 digit PIN, even if the adversary can easily afford much much more than $10^6$ computational cycles.
-
Fully homomorphic encryption allows computing over encrypted data. Bob could prepare Alice's taxes without knowing what her income is, and more generally store all her data and perform computations on it, without knowing what the data is.
-
Zero knowledge proofs can be used to prove a statement is true without revealing why it's true. In particular, since you can use zero knowledge proofs to prove that you possess X bitcoins without giving any information about their identity, they have been used to obtain fully anonymous electronic currency.
-
Multiparty secure computation is a fully general tool that enables Alice and Bob (and Charlie, David, Elana, Fran, ...) to perform any computation on their private inputs, whether it is to compute the result of a vote, run a second-price auction, do privacy-preserving data mining, perform a cryptographic operation in a distributed manner (without any party ever learning the secret key) or simply play poker online without needing to trust any central server.
(BTW all of the above points are notions that you should be familiar with, and you should be able to explain their security guarantees if you ever need to use them, for example, in the unlikely event that you ever find yourself needing to take a cryptography final exam...)
While clearly there are issues of efficiency, is there anything more in terms of functionality we could ask for? Given all these riches, can we be even more greedy?
It turns out that the answer is yes. Here are some scenarios that are still not covered by the above tools:
Suppose that you have uncovered a conspiracy that involves very powerful people, and you are afraid that something bad might happen to you.
You would like an "insurance policy" in the form of writing down everything you know and making sure it is published in the case of your untimely death, but are afraid these powerful people could find and attack any trusted agent.
Ideally you would like to publish an encrypted form of your manuscript far and wide, and make sure the decryption key is automatically revealed if anything happens to you, but how could you do that?
A UA-secure encryption (which stands for secure against an Underwood attack) gives an ability to create an encryption
The technical term for this notion is witness encryption by which we mean that for every circuit
Here is another scenario that is seemingly not covered by our current tools.
Suppose that Alice uses a public key system
It's not just individuals that don't have all their needs met by our current tools.
Think of a large enterprise that uses a public key encryption
This will allow us to give the key
The general form of this is called functional encryption. The idea is that for every function $f:\{0,1\}^*\rightarrow\{0,1\}^*$ we can create a decryption key
The formal definition of functional encryption is the following:
A tuple
- For every function $f:\{0,1\}^\ell\rightarrow\{0,1\}$, if $(d,e)=G(1^n)$ and $d_f = KeyDist(d,f)$, then for every message $m$, $D_{d_f}(E_e(m))=f(m)$.
- Every efficient adversary Eve wins the following game with probability at most $1/2 + negl(n)$:
  1. We generate $(d,e) \leftarrow_R G(1^n)$.
  2. Eve is given $e$ and for $i=1,\ldots,T=poly(n)$ repeatedly chooses $f_i$ and receives $d_{f_i}$.
  3. Eve chooses two messages $m_0,m_1$ such that $f_i(m_0)=f_i(m_1)$ for all $i = 1,\ldots, T$.
  4. For $b \leftarrow_R \{0,1\}$, Eve receives $c^* = E_e(m_b)$ and outputs $b'$.
  5. Eve wins if $b'=b$.
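The security game in this definition can be sketched as a small test harness. The following is a minimal illustration, not a real functional encryption scheme: `ToyInsecureFE`, `ParityAdversary`, `fe_game`, and `key_oracle` are all hypothetical names introduced here, and the toy scheme is deliberately insecure (its "ciphertext" is the message itself), so it satisfies only the correctness condition. The point is to show the order of the steps and the admissibility check $f_i(m_0)=f_i(m_1)$.

```python
import secrets
from typing import Callable, List, Tuple

Func = Callable[[int], int]


class ToyInsecureFE:
    """Hypothetical stand-in exposing G, E, D and KeyDist. NOT secure:
    the 'ciphertext' is the message itself, so only correctness holds."""

    def G(self, n: int) -> Tuple[str, str]:
        # Returns (d, e); placeholder strings, since the toy scheme ignores keys.
        return ("master-decryption-key", "encryption-key")

    def E(self, e: str, m: int) -> int:
        return m

    def key_dist(self, d: str, f: Func) -> Func:
        # The functional key d_f is just f itself in this toy.
        return f

    def D(self, d_f: Func, c: int) -> int:
        return d_f(c)


class ParityAdversary:
    """Illustrative Eve: asks for the parity key, then reads c* directly."""

    def choose_messages(self, e: str, key_oracle) -> Tuple[int, int]:
        key_oracle(lambda m: m % 2)  # one key query f_1
        return 2, 4                  # same parity, so the pair is admissible

    def guess(self, c_star: int) -> int:
        return 1 if c_star == 4 else 0


def fe_game(scheme, adversary, n: int = 16) -> bool:
    """Runs one round of the game; returns True if Eve wins."""
    d, e = scheme.G(n)                       # step 1
    queried: List[Func] = []

    def key_oracle(f: Func):                 # step 2: Eve obtains d_{f_i}
        queried.append(f)
        return scheme.key_dist(d, f)

    m0, m1 = adversary.choose_messages(e, key_oracle)   # step 3
    if any(f(m0) != f(m1) for f in queried):
        raise ValueError("inadmissible: some f_i separates m0 and m1")
    b = secrets.randbelow(2)                 # step 4
    c_star = scheme.E(e, (m0, m1)[b])
    return adversary.guess(c_star) == b      # step 5
```

Against the toy scheme Eve wins every round, which is exactly what the definition forbids: a secure scheme must hold her to probability at most $1/2 + negl(n)$ even after she has collected polynomially many functional keys.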
It's not only exotic forms of encryption that we're missing.
Here is another application that is not yet solved by the above tools.
From time to time software companies discover a vulnerability in their products.
For example, they might discover that if fed an input
All these applications and more could in principle be solved by a single general tool known as virtual black-box (VBB) secure software obfuscation. In fact, such an obfuscation is a general tool that can also be directly used to yield public key encryption, fully homomorphic encryption, zero knowledge proofs, secure function evaluation, and many more applications.
We will now give the definition of VBB secure obfuscation and prove the central result about it, which is unfortunately that secure VBB obfuscators do not exist. We will then talk about the relaxed notion of indistinguishability obfuscators (IO). This object turns out to be good enough for many of the above applications, and whether it exists is one of the most exciting open questions in cryptography at the moment. We will survey some of the research on this front.
Let's define a compiler to be an efficient (i.e., polynomial time) possibly probabilistic map
A compiler
$A(\mathcal{O}(C))$
-
$S^C(1^{|C|})$ where by this we mean the output of$S$ when it is given the length of$C$ and access to the function$x \mapsto C(x)$ as a black box (aka oracle access).
(Note that the distributions above are of a single bit, and so being indistinguishable simply means that the probability of outputting
The writings of Diffie and Hellman, James Ellis, and others who thought of public key encryption show that one of the first approaches they considered was to use obfuscation to transform a private-key encryption scheme into a public key one.
That is, given a private key encryption scheme
These days we know other approaches for obtaining public key encryption, but the obfuscation-based approach has significant additional flexibility.
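To turn this into a fully homomorphic encryption, we simply publish the obfuscation of $c,c' \mapsto E_k(D_k(c)\;\text{NAND}\;D_k(c'))$. (The result is re-encrypted so that the published program outputs a ciphertext rather than a plaintext bit.)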
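To make the shape of this NAND program concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the bit encryption `E`/`D` is a toy scheme that masks a bit with one bit of HMAC-SHA256 used as a pseudorandom function, and `make_nand_evaluator` is a hypothetical helper. The sketch re-encrypts the NAND of the two decrypted bits, so that its output is again a ciphertext. Crucially, a Python closure does not hide `k` at all; hiding the key inside the published program is precisely the job the obfuscator would have to do.

```python
import hashlib
import hmac
import secrets
from typing import Callable, Tuple

Ciphertext = Tuple[bytes, int]  # (nonce r, masked bit)


def prf_bit(k: bytes, r: bytes) -> int:
    """One pseudorandom bit derived from key k and nonce r (toy PRF use)."""
    return hmac.new(k, r, hashlib.sha256).digest()[0] & 1


def E(k: bytes, b: int) -> Ciphertext:
    """Toy randomized private-key encryption of a single bit."""
    r = secrets.token_bytes(16)
    return (r, b ^ prf_bit(k, r))


def D(k: bytes, c: Ciphertext) -> int:
    r, masked = c
    return masked ^ prf_bit(k, r)


def make_nand_evaluator(k: bytes) -> Callable[[Ciphertext, Ciphertext], Ciphertext]:
    """Builds the program c, c' -> E_k(D_k(c) NAND D_k(c')).

    In the construction it is the *obfuscation* of this program that gets
    published. Here the key is merely captured by a closure, which offers
    no secrecy whatsoever.
    """
    def nand_eval(c1: Ciphertext, c2: Ciphertext) -> Ciphertext:
        return E(k, 1 - (D(k, c1) & D(k, c2)))
    return nand_eval
```

Since NAND is universal, anyone holding such an evaluator could compute an arbitrary circuit gate by gate on encrypted bits, which is why a single obfuscated program would suffice for homomorphic evaluation.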
To turn this into a functional encryption, for every function
We can also use obfuscation to get a witness encryption, to encrypt a message error
otherwise.
To solve the patch problem, for a given regular expression we can obfuscate the function that maps
So far, we've learned that in cryptography no concept is too fantastic to be realized. Unfortunately, VBB secure obfuscation is an exception:
Under the PRG assumption, there does not exist a VBB secure obfuscating compiler.
We will now show the proof of the impossibility theorem above.
For starters, note that obfuscation is trivial for learnable functions.
That is, if
However, this is not so useful, since it's not hard to see that all the examples above where we wanted to use obfuscation involved functions that were unlearnable.
But it already suggests that we should use an unlearnable function for our negative result.
Here is an extremely simple unlearnable function.
For every
Given black box access for this function for a random
This function already yields a counterexample for a stronger version of the VBB definition.
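As a hedged illustration of what an unlearnable function can look like, the sketch below uses a point function: it returns a secret value on one secret input and all zeros everywhere else. The choice of a point function, and the names `make_point_function` and `black_box_learner`, are assumptions made for this example. With only black-box access, a learner making polynomially many queries is overwhelmingly unlikely to hit the special input, whereas anyone holding the program text (obfuscated or not) can trivially output that text, something no black-box simulator can reproduce.

```python
import secrets
from typing import Callable, Optional, Tuple

N_BYTES = 16  # input/output length in bytes (assumed parameter)


def make_point_function(alpha: bytes, beta: bytes) -> Callable[[bytes], bytes]:
    """Returns F with F(alpha) = beta and F(x) = 0...0 for every other x."""
    def F(x: bytes) -> bytes:
        return beta if x == alpha else bytes(len(beta))
    return F


def black_box_learner(oracle: Callable[[bytes], bytes],
                      queries: int) -> Optional[Tuple[bytes, bytes]]:
    """Queries random inputs; succeeds only if it stumbles on a nonzero output."""
    for _ in range(queries):
        x = secrets.token_bytes(N_BYTES)
        y = oracle(x)
        if any(y):
            return x, y
    return None
```

Each random query hits the special input with probability $2^{-128}$ for 16-byte inputs, so even millions of queries reveal essentially nothing about the two secrets.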
We define a strong VBB obfuscator to be a compiler
There does not exist a strong VBB obfuscator.
Suppose towards a contradiction that there exists a strong VBB obfuscator
these probabilities are over the coins of
Clearly $()$ implies that these two distributions are not indistinguishable, and so proving $()$ will finish the proof.
The algorithm
On the other hand, for
The adversary in the proof of the lemma above does not seem very impressive. After all, it merely printed out its input. Indeed, the definition of strong VBB security might simply be overkill, and "plain" VBB is enough for almost all applications. However, as mentioned above, plain VBB is impossible to achieve as well. We'll prove a slightly weaker version of the impossibility theorem above:
If fully homomorphic encryption exists then there is no VBB secure obfuscating compiler.
(To get the original theorem from this, note that if VBB obfuscation exists then we can transform any private key encryption into a fully homomorphic public key encryption.)
Let
We will use this function family where
We claim that for every simulator
Indeed, the distinguisher
In contrast if we let
Case 1: The query is equal to
Case 2: The query is equal to
Case 2 only happens with negligible probability because if
Case 1 only happens with negligible probability because otherwise
Now if neither case happens, then
This proof is simple but deserves a second read.
A crucial point here is to use FHE to allow the adversary to essentially "feed
The proof can be generalized to give private key encryption for which the transformation to public key encryption would be insecure, and many other such constructions. So, this result might (and indeed to a large extent did) seem like a death blow to general-purpose obfuscation. However, already in that paper we noticed that there was a variant of obfuscation that we could not rule out, and this is the following:
We say a compiler
It is a good exercise to understand why the proof of the impossibility result above does not apply to rule out IO. Nevertheless, a reasonable guess would be that:
-
IO is impossible to achieve.
-
Even if it was possible to achieve, it is not good enough for most of the interesting applications of obfuscation.
However, it turns out that this guess is (most likely) wrong. New results have shown that IO is extremely useful for many applications, including those outlined above. They also gave some evidence that it might be possible to achieve. We'll talk about those works in the next lecture.
Footnotes
-
One could also think of a deniable witness encryption, so that if Janine in the scenario above is forced to open the ciphertexts she sent by revealing the randomness used to create them, she can credibly claim that she didn't encrypt her knowledge of the conspiracy, but merely wanted to make sure that her family's secret recipe for pumpkin pie is not lost when she passes away.