# Session Key Construction: Extracting Randomness

Suppose Alice and Bob wish to perform a Diffie-Hellman key exchange. Then for some $$n$$ they pick an $$n$$-bit prime $$p$$, and an element $$g \in \mathbb{Z}_p^*$$ that is a generator of a cyclic group of prime order $$q$$.

Then Alice picks a random $$0 \le x \lt q$$ and sends $$g^x$$ to Bob, and Bob picks a random $$0 \le y \lt q$$ and sends $$g^y$$ to Alice. They both compute their shared secret $$g^xy$$. But this is not the whole story: now they must somehow derive a session key from this quantity. To an adversary, the session key should be indistinguishable from a random string.

In practice, an off-the-shelf hash function such as SHA1 is applied to some representation of $$g^{xy}$$, and then the 128 LSBs are used as the key. Unfortunately the only known way to prove that this is secure is to "prove by assumption": i.e. concoct the Hash Diffie-Hellman assumption.

## Security from the CDH assumption

One obvious way to proceed is to extract a hardcore bit of $$g^{xy}$$. For example, using the Goldreich-Levin method, they can find a hardcore bit $$B$$, and to derive a 128-bit key, they simply perform 128 Diffie-Hellman exchanges and extract $$B(g^{xy})$$ each time. This is secure provided we assume no efficient Computational Diffie-Hellman algorithm exists, for then the adversary should not be able to distinguish between the distributions $$\{g^x, g^y, B(g^{xy}), x,y \leftarrow \mathbb{Z}_q\}$$ and $$\{g^x, g^y, r, x,y\leftarrow\mathbb{Z}_q, r\leftarrow\{0,1\}\}$$. More formally:

Theorem: If the CDH problem is $$(t,\epsilon)$$-hard then there is no $$(t,\epsilon/128)$$-distinguisher between

$\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K=B(g^{x_1 y_1})||...||B(g^{x_{128} y_{128}}), \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}\}$

and

$\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K, \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}, K \leftarrow\{0,1\}^{128}\}$

Sketch of proof: distinguisher $$\implies$$ predictor for $$i$$th bit $$\implies$$ predictor for $$B(g^{xy})$$ given $$g^x, g^y$$ $$\implies$$ CDH algorithm.

However, this method is impractical as it requires one Diffie-Hellman exchange for each bit of the session key. Unfortunately, it is not known how to improve this method much without using a stronger assumption:

## Security from the DDH assumption

The $$(t,\epsilon)$$-Decisional Diffie-Hellman (DDH) assumption says that the distribution

$\{g,g^a,g^b,g^c | a,b,c \leftarrow \mathbb{Z}_q\}$

is $$(t,\epsilon)$$-indistinguishable from the distribution

$\{g,g^a,g^b,g^{ab} | a,b \leftarrow \mathbb{Z}_q\}$

(In other words, given $$g^a$$ and $$g^b$$, $$g^{a b}$$ is indistinguishable from a random element, which implies that $$\lfloor lg q \rfloor$$ bits of $$g^{ab}$$ are simultanesouly secure given $$g^a, g^b$$.

Using this assumption allows a provably secure scheme for extracting a session key, via the left-over lemma.

Let $$p$$ be a 1024-bit prime, and let $$q$$ be a 255-bit prime such that $$G$$ is a subgroup of $$\mathbb{Z}_p^*$$ of order $$q$$, and let $$g$$ be a generator of $$G$$. Let $$m = \lfloor \lg q \rfloor$$, and suppose $$|G| \approx 2^m$$. Set $$e = m/3$$, and let $$H=\{h_\lambda|\lambda\in\Lambda\}: \mathbb{Z}_p^* \rightarrow \{0,1\}^{m-2e} = \{0,1\}^{m/3} = \{0,1\}^{85}$$. be almost universal. Then consider the following key exchange protocol:

Alice picks $$x\leftarrow\mathbb{Z}_q, \lambda_1\leftarrow\Lambda$$ and sends $$g^x,\lambda_1$$ to Bob. Similarly, Bob picks $$y\leftarrow\mathbb{Z}_q, \lambda_2\leftarrow\Lambda$$ and sends $$g^y,\lambda_2$$ to Alice. They both compute the secret key $$SK=h_{\lambda_1\oplus\lambda_2}(g^{xy})$$.

Theorem: Assuming $$(t,\epsilon)$$-DDH on $$G$$, the distributions

$\{g,g^x,g^y,\lambda_1,\lambda_2,SK | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda\}$

and

$\{g,g^x,g^y,\lambda_1,\lambda_2,R | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda, R\leftarrow\{0,1\}^{m/3}\}$

are $$(t,\epsilon+1/2^{m/3})$$-indistinguishable.

Proof: Suppose $$A$$ is a $$(t,\epsilon + 1/2^{m/3})$$-distiguisher. Then we can construct a DDH-distiguisher $$B$$ as follows.

• $$B$$ is given a tuple $$I = \langle g,g_1,g_2,g_3 \rangle \in G^4$$.

• $$B$$ picks random $$\lambda_1,\lambda_2\leftarrow\Lambda$$, and computes $$S = h_{\lambda_1\oplus\lambda_2}(g_3)$$.

• $$B$$ gives the tuple $$I'=\langle g_1,g_2,g_3,\lambda_1,\lambda_2,S\rangle$$ to $$A$$, and outputs the output of $$A$$.

We show that $$B$$ is a $$(t,\epsilon)$$-DDH algorithm. Note

$|Pr[B(I)|I-DDH] - Pr[B(I)|I-randg3]| = |Pr[A(I')|I'-DDH] - Pr[A(I')|I'-randg3hash] \ge |Pr[A(I')|I'-DDH] - Pr(A(I')|I'-rand01m3]| - |Pr[A(I')|I'-rand01m3 - Pr[A(I')|I'-randg3hash]|$

where we have used the inequality $$|A+B| \ge |A| - |B|$$. Now the first term is at least $$\epsilon + 1/2^{m/3}$$, while the second term is at most $$1/2^{m/3}$$ by the leftover lemma.

## Leftover Lemma

The leftover lemma shows how to extracts randomness from a "defective" random source.

Definition: Let

$H=\{h_\lambda : \{0,1\}^n \rightarrow \{0,1\}^l\}_{\lambda \in \Lambda}$

$$H$$ is called a universal family of hash functions if for all $$x \ne y \in \{0,1\}^n$$ we have

$Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l$

$$H$$ is almost universal if for all $$x \ne y \in \{0,1\}^n$$ we have

$Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l + 1/2^n .$

A set of all linear congruential generators (modulo some prime $$p$$) is an example of an almost universal family of hash functions.

Definition: Let $$S$$ be a finite set and $$D, D'$$ be distributions on $$S$$. For all $$x\in S$$, denote the probability of picking $$x$$ according to $$D$$ by $$D(x)$$, and for all $$X\subseteq S$$, define $$D(X) = \sum_{x\in X}D(x)$$. Define the following:

• The collision probability of $$D$$ is $$coll(D)=\sum_{x\in S}D(x)^2$$.

• $$D, D'$$ are $$\epsilon$$-statistically indistinguishable if for all $$X\subseteq S$$, we have $$|D(X)-D'(X)| \lt \epsilon$$.

• $$D$$ is $$\epsilon$$-quasirandom if $$D$$ is $$\epsilon$$-statistically indistinguishable from the uniform distribution on $$S$$ (which we denote by $$U(S)$$. Hence for any subset $$X \subseteq S$$ we have $$|D(X)-|X|/|S|| \lt \epsilon.$$

Quasirandomness is useful. For suppose $$D$$ is $$\epsilon$$-quasirandom on $$S$$. Then for any algorithm $$A$$, we have

$|Pr_{x\leftarrow D}[A(x) \text{accepts}] - Pr_{x\leftarrow U(S)}[A(x) \text{accepts}]| \lt \epsilon$

Proof: Let $$T\subseteq S$$ be $$T\{x|A(x)\}$$. Then

$\array { |Pr_{x\leftarrow D}[A(x)] - Pr_{x\leftarrow U(S)}[A(x)]| &=& |Pr_{x\leftarrow D}[A(x)|x\in T]Pr[x\in T] \\ &+& Pr_{x\leftarrow D}[A(x)|x\notin T]Pr[x\notin T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\in T]Pr[x\in T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\notin T]Pr[x\notin T]| \\ &=& |Pr_{x\leftarrow D}[x\in T] - Pr_{x\leftarrow U(S)}[x \in T]| \\ &=& |D(T) - U(T)| \lt \epsilon }$

Leftover Lemma: Let $$G\subseteq\{0,1\}^n$$, $$|G|\gt 2^l$$. Let $$e\gt 0$$ and let $$H$$ be an almost universal hash family from $$\{0,1\}^n$$ to $$\{0,1\}^{l-2e}$$. Then the distribution $$\{(h,h(x))|h\leftarrow H, x\leftarrow G\}$$ is $$1/2^e$$-quasirandom on $$H \times \{0,1\}^{l-2e}$$.

Proof: there are two steps to the proof.

Claim: (proof omitted) because $$H$$ is almost universal, the collision probability

$coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2/2^{2e})/|H|2^{l-2e}$

Claim: Let $$D$$ be a distribution on a finite set $$S$$. If $$coll(D) \lt (1+2\delta^2)/|S|$$ then $$D$$ is $$\delta$$-quasirandom.

Proof of claim: Suppose there exists some subset $$X\subseteq S$$ such that $$D(X) \ge |X|/|S| + \delta$$. Without loss of generality, assume equality holds (we can just increase $$\delta$$). We have $$D(S\setminus X)=1-D(X)$$. Then

$\array { coll(D) &=& Pr_{x_1,x_2\leftarrow D}[x_1 = x_2] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]Pr[x_1 \wedge x_2\in X] + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]Pr[x_1 \wedge x_2\notin X] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]D(X)^2 + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]D(S\setminus X)^2 \\ &\ge& 1/|X| D(X)^2 + 1/|S\setminus X| D(S\setminus X)^2 \\ &=& D(X)^2/|X| + (1-D(X))^2/(|S|-|X|) \\ &=& 1/|S| + \delta^2/|X| + \delta^2/(|S|-|X|) }$

(Because we are considering collison probabilities, we can ignore the cases when only one of $$x_1, x_2$$ lies in $$X$$). The last line is minimized when $$|X| = |S|/2$$, thus we have

$coll(D) \ge 1/|S| + 4\delta^2/|S| \ge 1|S| + 2\delta^2/|S|$

So by the first claim

$coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2(1/2^e)^2)/|H|2^{l-2e}$

and we have $$|S| = |H|2^{l-2e}$$, $$\delta = 1/2^e$$. Then by the second claim, $$(h,h(x))$$ is $$1/2^e$$-quasirandom.

## More Powerful Extractors

There are more effective techniques than the leftover lemma [SZ94]. Let $$D$$ be a distribution on a finiste set $$S$$ and define

$min-entropy(D)=\min_{x\in S}\{-\lg D(x)|D(x) \ne 0\}.$

For example, the min-entropy of a point distribution is zero, and for the uniform distribution $$U(S)$$, it is $$\lg|S|$$.

Now suppose $$G$$ is a set with $$|G| = |S|/2^\delta$$. Then $$min-entropy(U(G)) = lg |S| - \delta$$. Thus we see that $$min-entropy(D)=\delta$$ implies that for all $$x \in S$$ we have $$D(x)\le 2^{-\delta}$$. Sometimes the distribution $$D$$ is referred to as a $$\delta$$-source. It turns out that it is possible to extract about $$\delta$$ bits from an arbitrary $$\delta$$-source.

Theorem: For any $$l$$, and $$e\le l/2$$, there exists a family $$H$$ of hash functions

$H=\{h_\lambda:\{0,1\}^n\rightarrow\{0,1\}^{l-2e}\}_{\lambda\in\Lambda}$

such that

• $$|H| = O(n 2^{4(l-e)})$$ (only $$4(l-e)+O(\log n)$$ bits are needed to describe the family)

• For any distribution $$D$$ on $$\{0,1\}^m$$ with $$min-entropy(D) = l$$, the distribution $$\{(h,h(x)|h\leftarrow H, x\leftarrow D\}$$ is $$2/2^e$$-quasirandom

The theorem is used to construct cryptosystems with exposure-resilience: even if an adversary learns "crazy" bits of a key, security is still maintained. The idea is to split a key into two parts, the first part specifies the hash function to use while the other bits are fed into the hash function.

Ben Lynn blynn@cs.stanford.edu 💡