Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be a permutation. A $(t,\epsilon)$-hardcore bit is a function $B:\{0,1\}^n\rightarrow\{0,1\}$ (predicate) such that for any $t$-time algorithm $M$ we have

$| Pr[M(f(X)) = B(X) | X\leftarrow\{0,1\}^n] - 1/2 | \lt \epsilon$

("$B(x)$ is hard to compute from $f(x)$.")

Examples:

• Hardcore bit of discrete log [Blum-Micali '81]: Let $p$ be an $n$-bit prime and let $g$ be a generator of $\mathbb{Z}_p^*$. Define $B:\mathbb{Z}_p^*\rightarrow\{0,1\}$ as

$B(x) = \mathrm{msb}(x) = \begin{cases} 0 & \text{if } x \lt p/2 \\ 1 & \text{if } x \gt p/2 \end{cases}$

Theorem: $B(x)$ is a $(t,\epsilon)$-hardcore bit of $f(x) = g^x \bmod p$ assuming there is no $t n^3/\epsilon$-time discrete log algorithm in $\mathbb{Z}_p^*$ (that succeeds with constant probability).

• Hardcore bit of RSA [ACGS '84]: Let $N=pq$ be an $n$-bit RSA modulus and let $e \ge 2$ satisfy $\gcd(e,\varphi(N))=1$. Define $B(x) = \mathrm{lsb}(x) = x \bmod 2$ ($B:\mathbb{Z}_N\rightarrow\{0,1\}$.)

Theorem: $B(x)$ is a $(t,\epsilon)$-hardcore bit of $f(x) = x^e \bmod N$ assuming there is no $t n^2/\epsilon^3$-time $e$th root algorithm for $\mathbb{Z}_N^*$.
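
A toy numerical instance of the RSA example (the parameters $p=61$, $q=53$, $e=17$ are made up and far too small for security):

```python
# Toy RSA hardcore-bit illustration (tiny, insecure parameters of our choosing).
p, q, e = 61, 53, 17       # N = 3233; gcd(e, (p-1)(q-1)) = 1
N = p * q

def f(x):
    return pow(x, e, N)    # the RSA function x^e mod N

def B(x):
    return x % 2           # lsb(x), the ACGS hardcore bit
```

At this size inverting $f$ (and hence computing $B(x)$ from $f(x)$) is trivial; the theorem is about honest $n$-bit moduli.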

The Blum-Micali generator is a PRNG built from a one-way permutation and its hardcore bit: iterate the permutation on a secret seed and output one hardcore bit per iteration.
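
A minimal sketch of this generator under toy, insecure parameters (the prime $2039$ and generator $7$ are our choices, and variants differ on whether the bit is output before or after the state update):

```python
def blum_micali(seed, nbits, p=2039, g=7):
    # Toy Blum-Micali generator (illustration only, insecure parameters):
    # iterate x -> g^x mod p and output the msb hardcore bit each round.
    # 7 generates Z_2039^*; real deployments use cryptographic sizes.
    x = seed
    out = []
    for _ in range(nbits):
        x = pow(g, x, p)                   # x <- f(x)
        out.append(0 if x < p / 2 else 1)  # output B(x) = msb(x)
    return out
```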

## Hardcore bits for general one-way permutations

It is possible to find a hardcore bit of any given one-way permutation:

• Yao’s XOR Lemma ['82]

• Goldreich-Levin ['89]

• Naslund ['96]

### Yao’s Method

Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be $(t,\epsilon)$-one-way.

First we note that there exists $1\le i \le n$ such that for all $(t/n)$-time algorithms $A$,

$Pr[A(f(x))=x|_i] \lt 1 - (1-\epsilon)/n$

(If not, then for every $i$ there exists a $(t/n)$-time algorithm $A_i$ with $Pr[A_i(f(x))=x_i]\ge1-(1-\epsilon)/n$. Running all $n$ algorithms takes time $n\cdot(t/n)=t$, and by the union bound $Pr[(\text{for all }i)\ A_i(f(x))=x|_i] \ge 1 - n\cdot(1-\epsilon)/n = \epsilon$, so we could invert $f$ in time $t$ with probability $\epsilon$, contradicting $(t,\epsilon)$-one-wayness.)

### Yao’s XOR Lemma

Theorem: Suppose $B:\{0,1\}^n\rightarrow\{0,1\}$ is a predicate such that for all $t$-time algorithms $A$, $Pr[A(f(x))=B(x)]\lt 1-\epsilon$. Define $B_m(x_1,...,x_m)=B(x_1)\oplus ... \oplus B(x_m)$. Then for $m=\lfloor 1/\epsilon\rfloor$ and all $(t/m)$-time algorithms $A_m$, we have

$Pr[A_m(f(x_1),...,f(x_m)) = B_m(x_1,...,x_m)]\lt 1/2 +\epsilon$

(where $x_1,...,x_m\leftarrow\{0,1\}^n$).

This implies that for $m=\lfloor 1/\epsilon\rfloor$, $B_m$ is a $(t/m,\epsilon)$-hardcore bit of $f(x_1)||...||f(x_m)$.
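The amplified predicate is just an XOR over independent blocks; a minimal sketch, with a stand-in predicate $B$ (bit parity) chosen purely for illustration:

```python
from functools import reduce

def B_m(B, blocks):
    # XOR a (weakly hardcore) predicate B over m independent inputs.
    return reduce(lambda acc, x: acc ^ B(x), blocks, 0)

# Stand-in weak predicate for the demo: parity of an integer's bits.
B = lambda x: bin(x).count("1") % 2
```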

### Goldreich-Levin Method

Given $x,y \in \{0,1\}^n$, define $\langle x,y \rangle = \sum_{i=1}^n x_i y_i \mod 2$.

Let $f:\{0,1\}^n\rightarrow\{0,1\}^n$ be a $(t,\epsilon/2)$-one-way permutation. Define $g:\{0,1\}^{2n}\rightarrow\{0,1\}^{2n}$ by $g(x,r)=f(x)||r$. Define $B:\{0,1\}^{2n}\rightarrow\{0,1\}$ by $B(x,r)=\langle x,r\rangle$.

Theorem: Let $f$ be a $(t,\epsilon/2)$-one-way permutation. Then $B$ is $(t',\epsilon)$-hardcore for $g$ where $t'=t/TIME(GL)$, where GL is the Goldreich-Levin algorithm.

Proof: Suppose $B$ is not $(t',\epsilon)$ hardcore for $g$. Then there exists a $t'$-time algorithm $A$ such that

$Pr[A(f(x),r) = \langle x,r\rangle |x,r\leftarrow\{0,1\}^n]\gt 1/2 +\epsilon$

Define the set $S\subseteq\{0,1\}^n$ by

$S = \{x\in\{0,1\}^n|Pr[A(f(x),r) = \langle x,r\rangle|r\leftarrow\{0,1\}^n] \gt 1/2 + \epsilon/2 \}$

In other words, $S$ consists of all $x$'s on which the algorithm succeeds with good probability.

By an averaging argument, $|S| \gt (\epsilon/2)2^n$: otherwise $A$'s overall success probability would be at most $\epsilon/2 + (1/2 + \epsilon/2) = 1/2 + \epsilon$, a contradiction. We shall construct an algorithm $D$ such that for all $x\in S$, $D(f(x)) = x$ with $E[TIME(D)] = t' \cdot TIME(GL) = O(t'(n^2/\epsilon^2))$.

For a fixed $x\in S$, define $A_x(r) = A(f(x),r)$. Then $Pr_r[A_x(r) = \langle x,r\rangle] \gt 1/2 + \epsilon / 2$.

The Goldreich-Levin Algorithm: given an oracle for $A_x$, the algorithm outputs all $y\in\{0,1\}^n$ such that $Pr[A_x(r) = \langle y,r\rangle] \gt 1/2 + \epsilon/2$ with running time $O(n^2/\epsilon^2)$.

It turns out there are at most $1/\epsilon^2$ such $y \in\{0,1\}^n$.

Then $D$ works by scanning the list output by the GL algorithm and returning the candidate $y$ with $f(y) = f(x)$, i.e. $y = f^{-1}(f(x))$.

The Rackoff method for the GL algorithm corrects the predictor's success probability from $1/2 + \epsilon$ to $1-\epsilon$. The corrected predictor can then be used to recover $x$ bit by bit.
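
A minimal sketch of that last step, assuming we already hold a predictor that agrees with $\langle x,r\rangle$ on all but a small fraction of $r$'s (the names `inner` and `recover` and the $5\%$ error rate are ours, for illustration):

```python
import random

def inner(x, r):
    # <x,r> = sum_i x_i r_i mod 2
    return sum(xi & ri for xi, ri in zip(x, r)) % 2

def recover(predictor, n, trials=101):
    # If predictor(r) = <x,r> except with small probability, then for
    # random r, predictor(r) XOR predictor(r XOR e_i) equals x_i whenever
    # both queries are answered correctly; a majority vote pins down x_i.
    x = []
    for i in range(n):
        votes = 0
        for _ in range(trials):
            r = [random.randrange(2) for _ in range(n)]
            r_flip = list(r)
            r_flip[i] ^= 1                      # r XOR e_i
            votes += predictor(r) ^ predictor(r_flip)
        x.append(1 if 2 * votes > trials else 0)
    return x
```

Note this simple trick needs success probability close to $1$; handling a $1/2+\epsilon$ predictor is exactly the harder part that the full GL algorithm solves.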

• Every one-way permutation has an efficient hardcore bit.

• The proof is based on a noisy predictor for the inner product.

• The GL algorithm also has applications in learning theory.

### Naslund Method

Let $f:\mathbb{F}_p\rightarrow\{0,1\}^n$ be a one-to-one function for some $n$-bit prime $p$. Define $g(x,a,b)=(f(x),a,b)$ where $a,b\in\mathbb{F}_p$. Then for $1\le i\le n-\Omega(\log n)$, define $B_i(x,a,b) = (ax + b \bmod p)|_i$. Note we ignore the $\Omega(\log n)$ most significant bits because they are biased.

Theorem: If $f$ is $(t,\epsilon/2)$-one-way, then for all $1\le i \le n-\Omega(\log n)$, $B_i$ is $(t',\epsilon')$-hardcore for $g$, where $t' = t/TIME(\text{Naslund}) = t/(n^2/\epsilon^2)$ and $\epsilon' = \epsilon + (c/p)$ for some constant $c$.

Proof: Suppose there exist $i$ and a $t'$-time algorithm $A_i$ such that

$Pr[A_i(f(x),a,b) = (ax+b \bmod p)|_i] \gt 1/2 + \epsilon'$

Define $S \subseteq \mathbb{F}_p$ by

$S = \{ x\in\mathbb{F}_p|Pr_{a,b}[A_i(f(x),a,b) = (ax+b \bmod p)|_i] \gt 1/2 + \epsilon'/2\}$

We shall build an algorithm $D$ such that for all $x\in S$, $D(f(x)) = x$, with running time $t'(n^2/\epsilon^2)$. Naslund's algorithm finds all such $x$ in time $n^2/\epsilon^2$ (counting calls to $A_i$ as unit cost).

Corollary: Let $p,q$ be primes with $q|p-1$, and suppose $q$ is $n$ bits long. Let $g\in\mathbb{Z}_p^*$ be an element of order $q$, and let $G=\langle g\rangle \le \mathbb{F}_p^*$. Then for all $1\le i \le n-\Omega(\log n)$, $x|_i$ is $(t,\epsilon)$-hardcore for $f(x)=g^x \bmod p$, assuming no $(t n^2/\epsilon^2)$-time discrete logarithm algorithm for $G$ exists.

Proof: Suppose there exists $i, A_i$ such that

$Pr_{x\leftarrow\mathbb{Z}_q}[A_i(g^x \bmod p) = x|_i ] \gt 1/2 + \epsilon$

and $A_i$ runs in time $t$. Then we shall construct a discrete log algorithm that outputs $x$ when given $y = g^x \bmod p$. Define $B_y(a,b) = (ax + b \bmod q)|_i$.

Then $B_y(a,b) = Dlog_g(y^a g^b)|_i$ and

$Pr_{a,b}[A_i(y^a g^b) = (ax + b \mod q)|_i] \gt 1/2 + \epsilon.$

Naslund’s algorithm will find all $x$'s that satisfy the above, and one of them is the discrete log of $y$.
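
A quick numeric sanity check of the identity $y^a g^b = g^{ax+b \bmod q}$ underlying $B_y$ (the toy parameters are our choice: $7$ generates $\mathbb{Z}_{2039}^*$, so $7^2$ has order $q = 1019$):

```python
# Toy check that y^a * g^b = g^(ax + b mod q) when g has order q mod p.
p, q = 2039, 1019            # p prime, q = (p - 1) / 2 also prime
g = pow(7, 2, p)             # 7 generates Z_p^*, so 7^2 has order q
x = 123                      # the unknown discrete log
y = pow(g, x, p)             # the input to the discrete log algorithm
a, b = 45, 67
lhs = (pow(y, a, p) * pow(g, b, p)) % p
rhs = pow(g, (a * x + b) % q, p)
assert lhs == rhs
```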

### Subset Sum Generator [IN '88]

Let $a_1,...,a_n\in\mathbb{F}_p$ where $p$ is an $m$-bit prime, and let $s\in\{0,1\}^n$. Set $b=\sum_{i=1}^n s_i a_i$. The subset sum problem is to find $s$ given $a_1,...,a_n,b$. The problem is believed to be hard for $n\approx m\approx 1000$ on random instances.

Take $m = (1+\epsilon)n$ for, say, $\epsilon=1/10$.

Define a generator as follows:

$G(a_1,...,a_n,s)=(a_1,...,a_n,\sum_{i=1}^n s_i a_i).$

On every iteration, $G$ produces $mn + m$ output bits from $mn + n$ input bits, a net expansion since $m \gt n$.
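
A minimal sketch of one application of $G$ (toy sizes and the function name are ours):

```python
import random

def subset_sum_G(a, s, p):
    # G(a_1,...,a_n, s) = (a_1,...,a_n, sum_{i: s_i = 1} a_i mod p)
    b = sum(ai for ai, si in zip(a, s) if si) % p
    return a + [b]

# Toy usage; real instances would use n ~ m ~ 1000.
p = 10007                                   # stand-in m-bit prime
a = [random.randrange(p) for _ in range(8)]
s = [random.randrange(2) for _ in range(8)]
out = subset_sum_G(a, s, p)
```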

Theorem [IN]: $G$ is a $(t,\epsilon)$-PRNG assuming no $t n^2/\epsilon^2$-time subset sum algorithm exists.

Proof: Suppose there exists $A$ that $(t,\epsilon)$-distinguishes between

$\{(a_1,...,a_n,R)|a_1,...,a_n,R\leftarrow\mathbb{F}_p\}$

and

$\{(a_1,...,a_n,b)|a_1,...,a_n\leftarrow\mathbb{F}_p,s\leftarrow\{0,1\}^n, b = \sum_{i=1}^n s_i a_i \}$

Then we shall construct an algorithm that takes as input $a_1,...,a_n,b \in\mathbb{F}_p$ and whose goal is to find $s\in\{0,1\}^n$.

To do this, we build a predictor $D$ such that

$Pr[D(a_1,...,a_n,b,r)=\langle r,s\rangle] \gt 1/2 +\epsilon$

and then use the Goldreich-Levin algorithm to find $s$. $D$ is given $a_1,...,a_n,b\in\mathbb{F}_p$ and $r\in\{0,1\}^n$, and works as follows:

1. Pick random $t \leftarrow\mathbb{F}_p$ and $k\leftarrow\{0,...,n\}$ ($k$ is a guess for $\sum_i r_i s_i$)

2. Set $a'_i = a_i + r_i t$; that is, $a'_i = a_i$ if $r_i = 0$ and $a'_i = a_i + t$ if $r_i = 1$

3. Run algorithm $A$ on input $(a'_1,...,a'_n,b+kt)$.

4. If $A$ fails, output $parity(k)$, otherwise output $1 - parity(k)$.
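
To see why steps 1-3 make sense: if the guess $k$ equals $\sum_i r_i s_i$, then $(a'_1,...,a'_n, b+kt)$ is again a valid subset-sum instance for the same secret $s$, and $parity(k) = \langle r,s\rangle$. A quick check (toy numbers of our choosing):

```python
import random

random.seed(1)
p, n = 10007, 8                              # toy prime and length
a = [random.randrange(p) for _ in range(n)]
s = [random.randrange(2) for _ in range(n)]
b = sum(ai for ai, si in zip(a, s) if si) % p
r = [random.randrange(2) for _ in range(n)]
t = random.randrange(p)
a_prime = [(ai + ri * t) % p for ai, ri in zip(a, r)]
k = sum(ri & si for ri, si in zip(r, s))     # the correct guess for k
# (a', b + k t) is consistent with the same secret s,
# since sum_{s_i=1} (a_i + r_i t) = b + t * sum_i r_i s_i (mod p):
assert sum(ai for ai, si in zip(a_prime, s) if si) % p == (b + k * t) % p
```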