Session Key Construction: Extracting Randomness

Suppose Alice and Bob wish to perform a Diffie-Hellman key exchange. Then for some \(n\) they pick an \(n\)-bit prime \(p\), and an element \(g \in \mathbb{Z}_p^*\) that is a generator of a cyclic group of prime order \(q\).

Then Alice picks a random \(0 \le x \lt q\) and sends \(g^x\) to Bob, and Bob picks a random \(0 \le y \lt q\) and sends \(g^y\) to Alice. They both compute their shared secret \(g^xy\). But this is not the whole story: now they must somehow derive a session key from this quantity. To an adversary, the session key should be indistinguishable from a random string.

In practice, an off-the-shelf hash function such as SHA1 is applied to some representation of \(g^{xy}\), and then the 128 LSBs are used as the key. Unfortunately the only known way to prove that this is secure is to "prove by assumption": i.e. concoct the Hash Diffie-Hellman assumption.

Security from the CDH assumption

One obvious way to proceed is to extract a hardcore bit of \(g^{xy}\). For example, using the Goldreich-Levin method, they can find a hardcore bit \(B\), and to derive a 128-bit key, they simply perform 128 Diffie-Hellman exchanges and extract \(B(g^{xy})\) each time. This is secure provided we assume no efficient Computational Diffie-Hellman algorithm exists, for then the adversary should not be able to distinguish between the distributions \(\{g^x, g^y, B(g^{xy}), x,y \leftarrow \mathbb{Z}_q\}\) and \(\{g^x, g^y, r, x,y\leftarrow\mathbb{Z}_q, r\leftarrow\{0,1\}\}\). More formally:

Theorem: If the CDH problem is \((t,\epsilon)\)-hard then there is no \((t,\epsilon/128)\)-distinguisher between

\[\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K=B(g^{x_1 y_1})||...||B(g^{x_{128} y_{128}}), \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}\}\]

and

\[\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K, \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}, K \leftarrow\{0,1\}^{128}\}\]

Sketch of proof: distinguisher \(\implies\) predictor for \(i\)th bit \(\implies\) predictor for \(B(g^{xy})\) given \(g^x, g^y\) \(\implies\) CDH algorithm.

However, this method is impractical as it requires one Diffie-Hellman exchange for each bit of the session key. Unfortunately, it is not known how to improve this method much without using a stronger assumption:

Security from the DDH assumption

The \((t,\epsilon)\)-Decisional Diffie-Hellman (DDH) assumption says that the distribution

\[\{g,g^a,g^b,g^c | a,b,c \leftarrow \mathbb{Z}_q\}\]

is \((t,\epsilon)\)-indistinguishable from the distribution

\[\{g,g^a,g^b,g^{ab} | a,b \leftarrow \mathbb{Z}_q\}\]

(In other words, given \(g^a\) and \(g^b\), \(g^{a b}\) is indistinguishable from a random element, which implies that \(\lfloor lg q \rfloor\) bits of \(g^{ab}\) are simultanesouly secure given \(g^a, g^b\).

Using this assumption allows a provably secure scheme for extracting a session key, via the left-over lemma.

Let \(p\) be a 1024-bit prime, and let \(q\) be a 255-bit prime such that \(G\) is a subgroup of \(\mathbb{Z}_p^*\) of order \(q\), and let \(g\) be a generator of \(G\). Let \(m = \lfloor \lg q \rfloor\), and suppose \(|G| \approx 2^m\). Set \(e = m/3\), and let \(H=\{h_\lambda|\lambda\in\Lambda\}: \mathbb{Z}_p^* \rightarrow \{0,1\}^{m-2e} = \{0,1\}^{m/3} = \{0,1\}^{85}\). be almost universal. Then consider the following key exchange protocol:

Alice picks \(x\leftarrow\mathbb{Z}_q, \lambda_1\leftarrow\Lambda\) and sends \(g^x,\lambda_1\) to Bob. Similarly, Bob picks \(y\leftarrow\mathbb{Z}_q, \lambda_2\leftarrow\Lambda\) and sends \(g^y,\lambda_2\) to Alice. They both compute the secret key \(SK=h_{\lambda_1\oplus\lambda_2}(g^{xy})\).

Theorem: Assuming \((t,\epsilon)\)-DDH on \(G\), the distributions

\[\{g,g^x,g^y,\lambda_1,\lambda_2,SK | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda\}\]

and

\[\{g,g^x,g^y,\lambda_1,\lambda_2,R | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda, R\leftarrow\{0,1\}^{m/3}\}\]

are \((t,\epsilon+1/2^{m/3})\)-indistinguishable.

Proof: Suppose \(A\) is a \((t,\epsilon + 1/2^{m/3})\)-distiguisher. Then we can construct a DDH-distiguisher \(B\) as follows.

\(B\) is given a tuple \(I = \langle g,g_1,g_2,g_3 \rangle \in G^4\).
\(B\) picks random \(\lambda_1,\lambda_2\leftarrow\Lambda\), and computes \(S = h_{\lambda_1\oplus\lambda_2}(g_3)\).
\(B\) gives the tuple \(I'=\langle g_1,g_2,g_3,\lambda_1,\lambda_2,S\rangle\) to \(A\), and outputs the output of \(A\).

We show that \(B\) is a \((t,\epsilon)\)-DDH algorithm. Note

where we have used the inequality \(|A+B| \ge |A| - |B|\). Now the first term is at least \(\epsilon + 1/2^{m/3}\), while the second term is at most \(1/2^{m/3}\) by the leftover lemma.

Leftover Lemma

The leftover lemma shows how to extracts randomness from a "defective" random source.

Definition: Let

\[H=\{h_\lambda : \{0,1\}^n \rightarrow \{0,1\}^l\}_{\lambda \in \Lambda}\]

\(H\) is called a universal family of hash functions if for all \(x \ne y \in \{0,1\}^n\) we have

\[Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l\]

\(H\) is almost universal if for all \(x \ne y \in \{0,1\}^n\) we have

\[Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l + 1/2^n .\]

A set of all linear congruential generators (modulo some prime \(p\)) is an example of an almost universal family of hash functions.

Definition: Let \(S\) be a finite set and \(D, D'\) be distributions on \(S\). For all \(x\in S\), denote the probability of picking \(x\) according to \(D\) by \(D(x)\), and for all \(X\subseteq S\), define \(D(X) = \sum_{x\in X}D(x)\). Define the following:

The collision probability of \(D\) is \(coll(D)=\sum_{x\in S}D(x)^2\).
\(D, D'\) are \(\epsilon\)-statistically indistinguishable if for all \(X\subseteq S\), we have \(|D(X)-D'(X)| \lt \epsilon\).
\(D\) is \(\epsilon\)-quasirandom if \(D\) is \(\epsilon\)-statistically indistinguishable from the uniform distribution on \(S\) (which we denote by \(U(S)\). Hence for any subset \(X \subseteq S\) we have \(|D(X)-|X|/|S|| \lt \epsilon.\)

Quasirandomness is useful. For suppose \(D\) is \(\epsilon\)-quasirandom on \(S\). Then for any algorithm \(A\), we have

\[ |Pr_{x\leftarrow D}[A(x) \text{accepts}] - Pr_{x\leftarrow U(S)}[A(x) \text{accepts}]| \lt \epsilon \]

Proof: Let \(T\subseteq S\) be \(T\{x|A(x)\}\). Then

\[ \array { |Pr_{x\leftarrow D}[A(x)] - Pr_{x\leftarrow U(S)}[A(x)]| &=& |Pr_{x\leftarrow D}[A(x)|x\in T]\Pr[x\in T] \\ &+& Pr_{x\leftarrow D}[A(x)|x\notin T]\Pr[x\notin T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\in T]\Pr[x\in T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\notin T]\Pr[x\notin T]| \\ &=& |Pr_{x\leftarrow D}[x\in T] - Pr_{x\leftarrow U(S)}[x \in T]| \\ &=& |D(T) - U(T)| \lt \epsilon } \]

Leftover Lemma: Let \(G\subseteq\{0,1\}^n\), \(|G|\gt 2^l\). Let \(e\gt 0\) and let \(H\) be an almost universal hash family from \(\{0,1\}^n\) to \(\{0,1\}^{l-2e}\). Then the distribution \(\{(h,h(x))|h\leftarrow H, x\leftarrow G\}\) is \(1/2^e\)-quasirandom on \(H \times \{0,1\}^{l-2e}\).

Proof: there are two steps to the proof.

Claim: (proof omitted) because \(H\) is almost universal, the collision probability

\[ coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2/2^{2e})/|H|2^{l-2e} \]

Claim: Let \(D\) be a distribution on a finite set \(S\). If \(coll(D) \lt (1+2\delta^2)/|S|\) then \(D\) is \(\delta\)-quasirandom.

Proof of claim: Suppose there exists some subset \(X\subseteq S\) such that \(D(X) \ge |X|/|S| + \delta\). Without loss of generality, assume equality holds (we can just increase \(\delta\)). We have \(D(S\setminus X)=1-D(X)\). Then

\[ \array { coll(D) &=& Pr_{x_1,x_2\leftarrow D}[x_1 = x_2] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]\Pr[x_1 \wedge x_2\in X] + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]\Pr[x_1 \wedge x_2\notin X] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]D(X)^2 + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]D(S\setminus X)^2 \\ &\ge& 1/|X| D(X)^2 + 1/|S\setminus X| D(S\setminus X)^2 \\ &=& D(X)^2/|X| + (1-D(X))^2/(|S|-|X|) \\ &=& 1/|S| + \delta^2/|X| + \delta^2/(|S|-|X|) } \]

(Because we are considering collison probabilities, we can ignore the cases when only one of \(x_1, x_2\) lies in \(X\)). The last line is minimized when \(|X| = |S|/2\), thus we have

\[coll(D) \ge 1/|S| + 4\delta^2/|S| \ge 1|S| + 2\delta^2/|S| \]

So by the first claim

\[ coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2(1/2^e)^2)/|H|2^{l-2e} \]

and we have \(|S| = |H|2^{l-2e}\), \(\delta = 1/2^e\). Then by the second claim, \((h,h(x))\) is \(1/2^e\)-quasirandom.

More Powerful Extractors

There are more effective techniques than the leftover lemma [SZ94]. Let \(D\) be a distribution on a finiste set \(S\) and define

\[min-entropy(D)=\min_{x\in S}\{-\lg D(x)|D(x) \ne 0\}.\]

For example, the min-entropy of a point distribution is zero, and for the uniform distribution \(U(S)\), it is \(\lg|S|\).

Now suppose \(G\) is a set with \(|G| = |S|/2^\delta\). Then \(min-entropy(U(G)) = lg |S| - \delta\). Thus we see that \(min-entropy(D)=\delta\) implies that for all \(x \in S\) we have \(D(x)\le 2^{-\delta}\). Sometimes the distribution \(D\) is referred to as a \(\delta\)-source. It turns out that it is possible to extract about \(\delta\) bits from an arbitrary \(\delta\)-source.

Theorem: For any \(l\), and \(e\le l/2\), there exists a family \(H\) of hash functions

\[ H=\{h_\lambda:\{0,1\}^n\rightarrow\{0,1\}^{l-2e}\}_{\lambda\in\Lambda} \]

such that

\(|H| = O(n 2^{4(l-e)})\) (only \(4(l-e)+O(\log n)\) bits are needed to describe the family)
For any distribution \(D\) on \(\{0,1\}^m\) with \(min-entropy(D) = l\), the distribution \(\{(h,h(x)|h\leftarrow H, x\leftarrow D\}\) is \(2/2^e\)-quasirandom

The theorem is used to construct cryptosystems with exposure-resilience: even if an adversary learns "crazy" bits of a key, security is still maintained. The idea is to split a key into two parts, the first part specifies the hash function to use while the other bits are fed into the hash function.

Ben Lynn blynn@cs.stanford.edu 💡

Session Key Construction: Extracting Randomness

Security from the CDH assumption

Security from the DDH assumption

Leftover Lemma

More Powerful Extractors

Contents