Suppose Alice and Bob wish to perform a Diffie-Hellman key exchange. Then for some $n$ they pick an $n$-bit prime $p$, and an element $g \in \mathbb{Z}_p^*$ that is a generator of a cyclic group of prime order $q$.

Then Alice picks a random $0 \le x \lt q$ and sends $g^x$ to Bob, and Bob picks a random $0 \le y \lt q$ and sends $g^y$ to Alice. They both compute their shared secret $g^xy$. But this is not the whole story: now they must somehow derive a session key from this quantity. To an adversary, the session key should be indistinguishable from a random string.

In practice, an off-the-shelf hash function such as SHA1 is applied to some representation of $g^{xy}$, and then the 128 LSBs are used as the key. Unfortunately the only known way to prove that this is secure is to "prove by assumption": i.e. concoct the Hash Diffie-Hellman assumption.

## Security from the CDH assumption

One obvious way to proceed is to extract a hardcore bit of $g^{xy}$. For example, using the Goldreich-Levin method, they can find a hardcore bit $B$, and to derive a 128-bit key, they simply perform 128 Diffie-Hellman exchanges and extract $B(g^{xy})$ each time. This is secure provided we assume no efficient Computational Diffie-Hellman algorithm exists, for then the adversary should not be able to distinguish between the distributions $\{g^x, g^y, B(g^{xy}), x,y \leftarrow \mathbb{Z}_q\}$ and $\{g^x, g^y, r, x,y\leftarrow\mathbb{Z}_q, r\leftarrow\{0,1\}\}$. More formally:

Theorem: If the CDH problem is $(t,\epsilon)$-hard then there is no $(t,\epsilon/128)$-distinguisher between

$\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K=B(g^{x_1 y_1})||...||B(g^{x_{128} y_{128}}), \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}\}$

and

$\{g^{x_1},...,g^{x^{128}}, g^{y_1},...,g^{y_{128}}, K, \bar{x} , \bar{y} \leftarrow \mathbb{Z}_q^{128}, K \leftarrow\{0,1\}^{128}\}$

Sketch of proof: distinguisher $\implies$ predictor for $i$th bit $\implies$ predictor for $B(g^{xy})$ given $g^x, g^y$ $\implies$ CDH algorithm.

However, this method is impractical as it requires one Diffie-Hellman exchange for each bit of the session key. Unfortunately, it is not known how to improve this method much without using a stronger assumption:

## Security from the DDH assumption

The $(t,\epsilon)$-Decisional Diffie-Hellman (DDH) assumption says that the distribution

$\{g,g^a,g^b,g^c | a,b,c \leftarrow \mathbb{Z}_q\}$

is $(t,\epsilon)$-indistinguishable from the distribution

$\{g,g^a,g^b,g^{ab} | a,b \leftarrow \mathbb{Z}_q\}$

(In other words, given $g^a$ and $g^b$, $g^{a b}$ is indistinguishable from a random element, which implies that $\lfloor lg q \rfloor$ bits of $g^{ab}$ are simultanesouly secure given $g^a, g^b$.

Using this assumption allows a provably secure scheme for extracting a session key, via the left-over lemma.

Let $p$ be a 1024-bit prime, and let $q$ be a 255-bit prime such that $G$ is a subgroup of $\mathbb{Z}_p^*$ of order $q$, and let $g$ be a generator of $G$. Let $m = \lfloor \lg q \rfloor$, and suppose $|G| \approx 2^m$. Set $e = m/3$, and let $H=\{h_\lambda|\lambda\in\Lambda\}: \mathbb{Z}_p^* \rightarrow \{0,1\}^{m-2e} = \{0,1\}^{m/3} = \{0,1\}^{85}$. be almost universal. Then consider the following key exchange protocol:

Alice picks $x\leftarrow\mathbb{Z}_q, \lambda_1\leftarrow\Lambda$ and sends $g^x,\lambda_1$ to Bob. Similarly, Bob picks $y\leftarrow\mathbb{Z}_q, \lambda_2\leftarrow\Lambda$ and sends $g^y,\lambda_2$ to Alice. They both compute the secret key $SK=h_{\lambda_1\oplus\lambda_2}(g^{xy})$.

Theorem: Assuming $(t,\epsilon)$-DDH on $G$, the distributions

$\{g,g^x,g^y,\lambda_1,\lambda_2,SK | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda\}$

and

$\{g,g^x,g^y,\lambda_1,\lambda_2,R | x,y\leftarrow\mathbb{Z}_q, \lambda_1,\lambda_2\leftarrow\Lambda, R\leftarrow\{0,1\}^{m/3}\}$

are $(t,\epsilon+1/2^{m/3})$-indistinguishable.

Proof: Suppose $A$ is a $(t,\epsilon + 1/2^{m/3})$-distiguisher. Then we can construct a DDH-distiguisher $B$ as follows.

• $B$ is given a tuple $I = \langle g,g_1,g_2,g_3 \rangle \in G^4$.

• $B$ picks random $\lambda_1,\lambda_2\leftarrow\Lambda$, and computes $S = h_{\lambda_1\oplus\lambda_2}(g_3)$.

• $B$ gives the tuple $I'=\langle g_1,g_2,g_3,\lambda_1,\lambda_2,S\rangle$ to $A$, and outputs the output of $A$.

We show that $B$ is a $(t,\epsilon)$-DDH algorithm. Note

$|Pr[B(I)|I-DDH] - Pr[B(I)|I-randg3]| = |Pr[A(I')|I'-DDH] - Pr[A(I')|I'-randg3hash] \ge |Pr[A(I')|I'-DDH] - Pr(A(I')|I'-rand01m3]| - |Pr[A(I')|I'-rand01m3 - Pr[A(I')|I'-randg3hash]|$

where we have used the inequality $|A+B| \ge |A| - |B|$. Now the first term is at least $\epsilon + 1/2^{m/3}$, while the second term is at most $1/2^{m/3}$ by the leftover lemma.

## Leftover Lemma

The leftover lemma shows how to extracts randomness from a "defective" random source.

Definition: Let

$H=\{h_\lambda : \{0,1\}^n \rightarrow \{0,1\}^l\}_{\lambda \in \Lambda}$

$H$ is called a universal family of hash functions if for all $x \ne y \in \{0,1\}^n$ we have

$Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l$

$H$ is almost universal if for all $x \ne y \in \{0,1\}^n$ we have

$Pr_{h\leftarrow H}[h(x) = h(y)] = 1/2^l + 1/2^n .$

A set of all linear congruential generators (modulo some prime $p$) is an example of an almost universal family of hash functions.

Definition: Let $S$ be a finite set and $D, D'$ be distributions on $S$. For all $x\in S$, denote the probability of picking $x$ according to $D$ by $D(x)$, and for all $X\subseteq S$, define $D(X) = \sum_{x\in X}D(x)$. Define the following:

• The collision probability of $D$ is $coll(D)=\sum_{x\in S}D(x)^2$.

• $D, D'$ are $\epsilon$-statistically indistinguishable if for all $X\subseteq S$, we have $|D(X)-D'(X)| \lt \epsilon$.

• $D$ is $\epsilon$-quasirandom if $D$ is $\epsilon$-statistically indistinguishable from the uniform distribution on $S$ (which we denote by $U(S)$. Hence for any subset $X \subseteq S$ we have $|D(X)-|X|/|S|| \lt \epsilon.$

Quasirandomness is useful. For suppose $D$ is $\epsilon$-quasirandom on $S$. Then for any algorithm $A$, we have

$|Pr_{x\leftarrow D}[A(x) \text{accepts}] - Pr_{x\leftarrow U(S)}[A(x) \text{accepts}]| \lt \epsilon$

Proof: Let $T\subseteq S$ be $T\{x|A(x)\}$. Then

$\array { |Pr_{x\leftarrow D}[A(x)] - Pr_{x\leftarrow U(S)}[A(x)]| &=& |Pr_{x\leftarrow D}[A(x)|x\in T]Pr[x\in T] \\ &+& Pr_{x\leftarrow D}[A(x)|x\notin T]Pr[x\notin T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\in T]Pr[x\in T] \\ &-& Pr_{x\leftarrow U(S)}[A(x)|x\notin T]Pr[x\notin T]| \\ &=& |Pr_{x\leftarrow D}[x\in T] - Pr_{x\leftarrow U(S)}[x \in T]| \\ &=& |D(T) - U(T)| \lt \epsilon }$

Leftover Lemma: Let $G\subseteq\{0,1\}^n$, $|G|\gt 2^l$. Let $e\gt 0$ and let $H$ be an almost universal hash family from $\{0,1\}^n$ to $\{0,1\}^{l-2e}$. Then the distribution $\{(h,h(x))|h\leftarrow H, x\leftarrow G\}$ is $1/2^e$-quasirandom on $H \times \{0,1\}^{l-2e}$.

Proof: there are two steps to the proof.

Claim: (proof omitted) because $H$ is almost universal, the collision probability

$coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2/2^{2e})/|H|2^{l-2e}$

Claim: Let $D$ be a distribution on a finite set $S$. If $coll(D) \lt (1+2\delta^2)/|S|$ then $D$ is $\delta$-quasirandom.

Proof of claim: Suppose there exists some subset $X\subseteq S$ such that $D(X) \ge |X|/|S| + \delta$. Without loss of generality, assume equality holds (we can just increase $\delta$). We have $D(S\setminus X)=1-D(X)$. Then

$\array { coll(D) &=& Pr_{x_1,x_2\leftarrow D}[x_1 = x_2] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]Pr[x_1 \wedge x_2\in X] + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]Pr[x_1 \wedge x_2\notin X] \\ &=& Pr_D [x_1=x_2|x_1 \wedge x_2 \in X]D(X)^2 + Pr_D [x_1=x_2|x_1 \wedge x_2 \notin X]D(S\setminus X)^2 \\ &\ge& 1/|X| D(X)^2 + 1/|S\setminus X| D(S\setminus X)^2 \\ &=& D(X)^2/|X| + (1-D(X))^2/(|S|-|X|) \\ &=& 1/|S| + \delta^2/|X| + \delta^2/(|S|-|X|) }$

(Because we are considering collison probabilities, we can ignore the cases when only one of $x_1, x_2$ lies in $X$). The last line is minimized when $|X| = |S|/2$, thus we have

$coll(D) \ge 1/|S| + 4\delta^2/|S| \ge 1|S| + 2\delta^2/|S|$

So by the first claim

$coll((h,h(x))_{h\leftarrow H,x\leftarrow G}) \lt (1+2(1/2^e)^2)/|H|2^{l-2e}$

and we have $|S| = |H|2^{l-2e}$, $\delta = 1/2^e$. Then by the second claim, $(h,h(x))$ is $1/2^e$-quasirandom.

## More Powerful Extractors

There are more effective techniques than the leftover lemma [SZ94]. Let $D$ be a distribution on a finiste set $S$ and define

$min-entropy(D)=\min_{x\in S}\{-\lg D(x)|D(x) \ne 0\}.$

For example, the min-entropy of a point distribution is zero, and for the uniform distribution $U(S)$, it is $\lg|S|$.

Now suppose $G$ is a set with $|G| = |S|/2^\delta$. Then $min-entropy(U(G)) = lg |S| - \delta$. Thus we see that $min-entropy(D)=\delta$ implies that for all $x \in S$ we have $D(x)\le 2^{-\delta}$. Sometimes the distribution $D$ is referred to as a $\delta$-source. It turns out that it is possible to extract about $\delta$ bits from an arbitrary $\delta$-source.

Theorem: For any $l$, and $e\le l/2$, there exists a family $H$ of hash functions

$H=\{h_\lambda:\{0,1\}^n\rightarrow\{0,1\}^{l-2e}\}_{\lambda\in\Lambda}$

such that

• $|H| = O(n 2^{4(l-e)})$ (only $4(l-e)+O(\log n)$ bits are needed to describe the family)

• For any distribution $D$ on $\{0,1\}^m$ with $min-entropy(D) = l$, the distribution $\{(h,h(x)|h\leftarrow H, x\leftarrow D\}$ is $2/2^e$-quasirandom

The theorem is used to construct cryptosystems with exposure-resilience: even if an adversary learns "crazy" bits of a key, security is still maintained. The idea is to split a key into two parts, the first part specifies the hash function to use while the other bits are fed into the hash function.