Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be a permutation. A $(t,\epsilon)$-hardcore bit for $f$ is a predicate $B:\{0,1\}^n\rightarrow\{0,1\}$ such that for every $t$-time algorithm $M$ we have

\[ | Pr[M(f(X)) = B(X) | X\leftarrow\{0,1\}^n] - 1/2 | \lt \epsilon \]

("$B(x)$ is hard to compute from $f(x)$.")

Examples:

  • Hardcore bit of discrete log [Blum-Micali '81]: Let $p$ be an $n$-bit prime, $g \in \mathbb{Z}_p^*$. Define $B:\mathbb{Z}_p^*\rightarrow\{0,1\}$ as

\[ B(x) = msb(x) = \begin{cases} 0 & \text{if } x \lt p/2 \\ 1 & \text{if } x \gt p/2 \end{cases} \]

Theorem: $B(x)$ is a $(t,\epsilon)$-hardcore bit of $f(x) = g^x \mod p$ assuming there is no $(t n^3/\epsilon)$-time discrete log algorithm in $\mathbb{Z}_p^*$ (that succeeds with constant probability).

  • Hardcore bit of RSA [ACGS '84]: Let $N=pq$ be an $n$-bit RSA modulus and $e \ge 2$. Define $B(x) = lsb(x) = x \mod 2$ ($B:\mathbb{Z}_N\rightarrow\{0,1\}$).

Theorem: $B(x)$ is a $(t,\epsilon)$-hardcore bit of $f(x) = x^e \mod N$ assuming there is no $(t n^2/\epsilon^3)$-time $e$th root algorithm for $\mathbb{Z}_N^*$.
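To make the two examples concrete, here is a small Python sketch of both predicates. The parameters below ($p$, $g$, $N$, $e$) are hypothetical toy values chosen only for illustration and are far too small to be secure.

```python
# Toy sketch of the two example hardcore predicates (insecure parameters).
p, g = 1019, 2      # hypothetical small prime and generator for the discrete-log example
N, e = 3233, 17     # hypothetical small RSA modulus (61 * 53) and public exponent

def f_dlog(x):
    """f(x) = g^x mod p."""
    return pow(g, x, p)

def B_dlog(x):
    """msb(x): 0 if x < p/2, 1 if x > p/2."""
    return 0 if x < p / 2 else 1

def f_rsa(x):
    """f(x) = x^e mod N."""
    return pow(x, e, N)

def B_rsa(x):
    """lsb(x) = x mod 2."""
    return x % 2

print(f_dlog(123), B_dlog(123), f_rsa(123), B_rsa(123))
```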

The Blum-Micali generator is a PRNG that is constructed from hardcore bits.
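As a minimal sketch (reusing the toy `f_dlog`/`B_dlog` from the snippet above): start from a seed, repeatedly apply the permutation, and output one hardcore bit per step. Whether the bit is taken before or after advancing the state varies between write-ups; the choice below is one common convention.

```python
def blum_micali(seed, num_bits, f, B):
    """Sketch of the Blum-Micali construction: iterate the one-way
    permutation f and output the hardcore bit B of the state."""
    x = seed
    bits = []
    for _ in range(num_bits):
        x = f(x)           # x_{i+1} = f(x_i)
        bits.append(B(x))  # output B(x_{i+1})
    return bits

# e.g. blum_micali(seed=123, num_bits=16, f=f_dlog, B=B_dlog)
```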

Hardcore bits for general one-way permutations

It is possible to find a hardcore bit of any given one-way permutation:

  • Yao’s XOR Lemma ['82]

  • Goldreich-Levin ['87]

  • Naslund ['96]

Yao’s Method

Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be $(t,\epsilon)$-one-way.

First we note that there exists $1\le i \le n$ such that for all $(t/n)$-time algorithms $A$,

\[Pr[A(f(x))=x|_i] \lt 1 - (1-\epsilon)/n\]

(If not, then for every $i$ there exists a $(t/n)$-time algorithm $A_i$ with $Pr[A_i(f(x))=x|_i]\ge 1-(1-\epsilon)/n$. By the union bound, $Pr[A_i(f(x))=x|_i \text{ for all } i] \ge 1 - n\cdot(1-\epsilon)/n = \epsilon$, so running all $n$ algorithms recovers $x$, i.e., inverts $f$, in time $n\cdot(t/n)=t$ with probability at least $\epsilon$, contradicting $(t,\epsilon)$-one-wayness.)

Yao’s XOR Lemma

Theorem: Suppose $B:\{0,1\}^n\rightarrow\{0,1\}$ is a predicate such that for all $t$-time algorithms $A$, $Pr[A(f(x))=B(x)]\lt 1-\epsilon$. Define $B_m(x_1,...,x_m)=B(x_1)\oplus ... \oplus B(x_m)$. Then, for $m=\lfloor 1/\epsilon\rfloor$ and for all $(t/m)$-time $A_m$, we have

\[Pr[A_m(f(x_1),...,f(x_m)) = B_m(x_1,...,x_m)]\lt 1/2 +\epsilon\]

(where $x_1,...,x_m\leftarrow\{0,1\}^n$).

This implies that for $m=\lfloor 1/\epsilon\rfloor$, $B_m$ is a $(t/m,\epsilon)$-hardcore bit of $f(x_1)||...||f(x_m)$.
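The construction itself is just an XOR over independent blocks; a short sketch, with a toy predicate standing in for $B$:

```python
from functools import reduce

def B_m(B, xs):
    """Yao's construction: XOR the predicate B over m independent inputs."""
    return reduce(lambda acc, x: acc ^ B(x), xs, 0)

# Toy example with B = least significant bit and m = 3 blocks:
print(B_m(lambda x: x & 1, [5, 12, 7]))   # 1 ^ 0 ^ 1 = 0
```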

Goldreich-Levin Method

Given $x,y \in \{0,1\}^n$, define $\langle x,y \rangle = \sum_{i=1}^n x_i y_i \mod 2$.

Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be a $(t,\epsilon/2)$-one-way permutation. Define $g:\{0,1\}^{2n}\rightarrow\{0,1\}^{2n}$ by $g(x,r)=f(x)||r$. Define $B:\{0,1\}^{2n}\rightarrow\{0,1\}$ by $B(x,r)=\langle x,r\rangle$.
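Concretely, with $n$-bit strings represented as bit lists, the construction looks as follows; this is only a sketch, and the `rotate` map below is a toy stand-in for a one-way permutation (it is of course not one-way).

```python
import secrets

def inner_product(x_bits, r_bits):
    """<x, r> = sum_i x_i * r_i mod 2."""
    return sum(xi & ri for xi, ri in zip(x_bits, r_bits)) % 2

def g(f, x_bits, r_bits):
    """g(x, r) = f(x) || r : apply f to x and pass r through in the clear."""
    return f(x_bits), r_bits

def B(x_bits, r_bits):
    """The Goldreich-Levin predicate B(x, r) = <x, r>."""
    return inner_product(x_bits, r_bits)

n = 8
rotate = lambda bits: bits[1:] + bits[:1]   # toy permutation (NOT one-way)
x = [secrets.randbits(1) for _ in range(n)]
r = [secrets.randbits(1) for _ in range(n)]
print(g(rotate, x, r), B(x, r))
```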

Theorem: Let $f$ be a $(t,\epsilon/2)$-one-way permutation. Then $B$ is $(t',\epsilon)$-hardcore for $g$, where $t'=t/TIME(GL)$ and GL is the Goldreich-Levin algorithm described below.

Proof: Suppose $B$ is not $(t',\epsilon)$-hardcore for $g$. Then there exists a $t'$-time algorithm $A$ such that

\[ Pr[A(f(x),r) = \langle x,r\rangle |x,r\leftarrow\{0,1\}^n]\gt 1/2 +\epsilon \]

Define the set $S\subseteq\{0,1\}^n$ by

\[ S = \{x\in\{0,1\}^n|Pr[A(f(x),r) = \langle x,r\rangle|r\leftarrow\{0,1\}^n] \gt 1/2 + \epsilon/2 \} \]

In other words, $S$ contains all $x$'s for which the algorithm works with good probability.

By a counting argument, $|S| \gt (\epsilon/2)2^n$. We shall construct an algorithm $D$ such that for all $x\in S$, $D(f(x)) = x$ with $E[TIME(D)] = t' TIME(GL) = O(t'(n^2/\epsilon^2))$.

For a fixed $x\in S$, define $A_x(r) = A(f(x),r)$. Then $Pr_r[A_x(r) = \langle x,r\rangle] \gt 1/2 + \epsilon / 2$.

The Goldreich-Levin Algorithm: given an oracle for $A_x$, the algorithm outputs all $y\in\{0,1\}^n$ such that $Pr[A_x(r) = \langle y,r\rangle] \gt 1/2 + \epsilon/2$ with running time $O(n^2/\epsilon^2)$.

It turns out there are at most $1/\epsilon^2$ such $y \in\{0,1\}^n$.

Then $D$ works by scanning the list output by the GL algorithm and returning the candidate $y$ with $f(y)=f(x)$, i.e., $y = f^{-1}(f(x)) = x$.

The Rackoff method for the GL algorithm corrects the predictor's success probability from $1/2 + \epsilon$ to $1-\epsilon$; this corrected predictor can then be used to recover $x$ bit by bit.
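A sketch of that last step, assuming we are handed such a corrected predictor `A_x` that agrees with $\langle x,r\rangle$ on all but a small fraction of $r$'s: since $\langle x,r\rangle \oplus \langle x,r\oplus e_i\rangle = x_i$, each bit of $x$ can be read off by a majority vote over random $r$'s.

```python
import secrets

def recover_x(A_x, n, trials=101):
    """Recover x bit by bit from a predictor A_x(r) that equals <x, r>
    for all but an eps fraction of r.  For each i, both queries below are
    simultaneously correct with probability >= 1 - 2*eps, so the majority
    vote returns x_i with high probability."""
    x = []
    for i in range(n):
        votes = 0
        for _ in range(trials):
            r = [secrets.randbits(1) for _ in range(n)]
            r_flip = r.copy()
            r_flip[i] ^= 1                   # r XOR e_i
            votes += A_x(r) ^ A_x(r_flip)    # equals x_i when both answers are correct
        x.append(1 if 2 * votes > trials else 0)
    return x
```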

  • Every one-way permutation has an efficient hardcore bit.

  • The proof is based on a noisy predictor for the inner product.

  • The GL algorithm also shows up in learning theory.

Naslund Method

Let $f:\mathbb{F}_p\rightarrow\{0,1\}^n$ be a one-to-one function, where $p$ is an $n$-bit prime. Define $g(x,a,b)=(f(x),a,b)$ where $a,b\in\mathbb{F}_p$. Then for $1\le i\le n-\Omega(\log n)$, define $B_i(x,a,b) = (ax + b \mod p)|_i$. Note that we ignore the $\log n$ most significant bits because they are biased.
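A toy sketch of the predicate; taking bit $1$ to be the least significant bit is an assumption here (consistent with discarding the biased most significant bits), and `p` below is a hypothetical small prime.

```python
p = 1019   # hypothetical small prime standing in for an n-bit prime

def B_i(i, x, a, b):
    """Naslund-style predicate: the i-th bit of (a*x + b mod p),
    counting bit 1 as the least significant bit (an assumption)."""
    return ((a * x + b) % p >> (i - 1)) & 1

print(B_i(3, x=17, a=501, b=42))
```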

Theorem: If $f$ is $(t,\epsilon/2)$-one-way, then for all $1\le i \le n-\Omega(\log n)$, $B_i$ is $(t',\epsilon')$-hardcore for $g$, where $t' = t/TIME(Naslund) = t/(n^2/\epsilon^2)$ and $\epsilon' = \epsilon + (c/p)$ for some constant $c$.

Proof: Suppose there exist $i$ and a $t'$-time algorithm $A_i$ such that

\[ Pr_{x,a,b}[A_i(f(x),a,b) = (ax+b \mod p)|_i] \gt 1/2 + \epsilon' \]

Define $S \subseteq \mathbb{F}_p$ by

\[ S = \{ x\in\mathbb{F}_p|Pr_{a,b}[A_i(f(x),a,b) = (ax+b \mod p)|_i] \gt 1/2 + \epsilon' /2\} \]

We shall build an algorithm $D$ such that for all $x\in S$, $D(f(x)) = x$, and the running time of $D$ is $t'(n^2/\epsilon^2)$. Naslund’s algorithm finds all such $x$ in time $n^2/\epsilon^2$.

Corollary: Let $p,q$ be primes, where $q$ is an $n$-bit prime. Let $g\in\mathbb{Z}_p^*$ be an element of order $q$, and let $G=\langle g\rangle \triangleleft\mathbb{F}_p^*$. Then for all $1\le i \le n-\Omega(\log n)$, $x|_i$ is $(t,\epsilon)$-hardcore for $f(x)=g^x \mod p$, assuming no $(t n^2/\epsilon^2)$-time discrete logarithm algorithm exists.

Proof: Suppose there exist $i$ and $A_i$ such that

\[Pr_{x\leftarrow\mathbb{Z}_q}[A_i(g^x \mod p) = x|_i ] \gt 1/2 + \epsilon \]

and $A_i$ runs in time $t$. Then we shall construct a discrete log algorithm that outputs $x$ when given $y = g^x \mod p$. Define $B_y(a,b) = (ax + b \mod q)|_i$.

Then $B_y(a,b) = Dlog_g(y^a g^b)|_i$ and

\[Pr_{a,b}[A_i(y^a g^b) = (ax + b \mod q)|_i] \gt 1/2 + \epsilon. \]

Naslund’s algorithm will find all $x'$ that satisfy the above, and one of them, namely the candidate with $g^{x'} \equiv y \pmod p$, is the discrete log of $y$.
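A sketch of the oracle wrapper this reduction uses; `A_i` (the assumed bit predictor) and the Naslund recovery routine are hypothetical names, not implementations.

```python
# Sketch of the self-reduction: from y = g^x mod p, queries about the bits of
# a*x + b (mod q) are answered by asking A_i about y^a * g^b (mod p),
# since Dlog_g(y^a * g^b) = a*x + b (mod q).
def make_B_y(A_i, y, g, p):
    def B_y(a, b):
        return A_i((pow(y, a, p) * pow(g, b, p)) % p)   # guess for (a*x + b mod q)|_i
    return B_y

# Naslund's algorithm, run with oracle access to B_y, returns a short list of
# candidates x'; the one with pow(g, x_prime, p) == y is the discrete log of y.
```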

Subset Sum Generator [IN '88]

Let $a_1,...,a_n\in\mathbb{F}_p$ where $p$ is an $m$-bit prime, and let $s\in\{0,1\}^n$. Set $b=\sum_{i=1}^n s_i a_i$. The subset sum problem is to find $s$ given $a_1,...,a_n,b$. The problem is believed to be hard on random instances for $n\approx m\approx 1000$.

Take $m = (1+\epsilon)n$, for, say, $\epsilon=1/10$.

Define a generator as follows

\[ G(a_1,...,a_n,s)=(a_1,...,a_n,\sum_{i=1}^n s_i a_i). \]

On every iteration, $G$ produces $mn + m$ bits from $mn + n$ bits.
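A sketch of one application of $G$, with Python integers standing in for elements of $\mathbb{F}_p$; the prime below is a toy placeholder rather than an $\approx 1000$-bit prime.

```python
import secrets

p = 2039                 # toy m-bit prime (m = 11); in practice m is around 1000
m = p.bit_length()
n = 10                   # m = (1 + 1/10) * n

def G(a, s):
    """Subset sum generator: output a_1..a_n together with b = sum_i s_i a_i mod p."""
    b = sum(ai for ai, si in zip(a, s) if si) % p
    return a, b

a = [secrets.randbelow(p) for _ in range(n)]   # n random elements of F_p
s = [secrets.randbits(1) for _ in range(n)]    # the n-bit seed
print(G(a, s))
# roughly m*n + n input bits are stretched to m*n + m output bits (m > n)
```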

Theorem [IN]: $G$ is a $(t,\epsilon)$-PRNG assuming no $t n^2/\epsilon^2$-time subset sum algorithm exists.

Proof: Suppose there exists $A$ that $(t,\epsilon)$-distinguishes between

\[\{(a_1,...,a_n,R)|a_1,...,a_n,R\leftarrow\mathbb{F}_p\}\]

and

\[\{(a_1,...,a_n,b)|a_1,...,a_n\leftarrow\mathbb{F}_p,s\leftarrow\{0,1\}^n, b = \sum_{i=1}^n s_i a_i \} \]

Then we shall construct an algorithm that takes as input $a_1,...,a_n,b \in\mathbb{F}_p$ and whose goal is to find $s\in\{0,1\}^n$.

To do this, we build a predictor $D$ such that

\[ Pr[D(a_1,...,a_n,b,r)=\langle r,s\rangle] \gt 1/2 +\epsilon \]

and then use the Goldreich-Levin algorithm to find $s$. $D$ is given $a_1,...,a_n,b\in\mathbb{F}_p$ and $r\in\{0,1\}^n$, and works as follows (a code sketch follows the steps):

  1. Pick random $t \leftarrow\mathbb{F}_p$, $k\leftarrow\{0,...,n\}$

  2. Define $a'_i = a_i$ if $r_i = 0$, and $a'_i = a_i + t$ if $r_i = 1$.

  3. Run algorithm $A$ on input $(a'_1,...,a'_n,b+kt)$.

  4. If $A$ fails, output $parity(k)$, otherwise output $1 - parity(k)$.
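A code sketch of $D$; the distinguisher `A` is assumed (hypothetical), and step 4 is read here as: '$A$ fails' means $A$ classifies its input as random rather than as an output of $G$ (an assumption).

```python
import secrets

def D(A, a, b, r, p):
    """Sketch of the predictor for <r, s>, following steps 1-4 above.
    A(a_list, b) is the assumed subset-sum distinguisher (hypothetical),
    returning True when it thinks its input is an output of G."""
    n = len(a)
    t = secrets.randbelow(p)                  # 1. random t in F_p ...
    k = secrets.randbelow(n + 1)              #    ... and k in {0, ..., n}
    a_prime = [(ai + t) % p if ri else ai     # 2. shift a_i by t where r_i = 1
               for ai, ri in zip(a, r)]
    accepted = A(a_prime, (b + k * t) % p)    # 3. run the distinguisher
    # 4. if A "fails" (classifies the input as random), output parity(k);
    #    otherwise output 1 - parity(k), as stated in step 4 above.
    return k % 2 if not accepted else 1 - (k % 2)
```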