# Hardcore Bits

Let $$f:\{0,1\}^n\rightarrow\{0,1\}^n$$ be a permutation. A $$(t,\epsilon)$$-hardcore bit is a function $$B:\{0,1\}^n\rightarrow\{0,1\}$$ (predicate) such that for any $$t$$-time algorithm $$M$$ we have

$| \Pr[M(f(X)) = B(X) | X\leftarrow\{0,1\}^n] - 1/2 | \lt \epsilon$

("$$B(x)$$ is hard to compute from $$f(x)$$.")

Examples:

• Hardcore bit of discrete log [Blum-Micali '81]: Let $$p$$ be an $$n$$-bit prime, $$g \in \mathbb{Z}_p^*$$. Define $$B:\mathbb{Z}_p^*\rightarrow\{0,1\}$$ as

$B(x) = msb(x) = \begin{cases} 0 & \text{if } x \lt p/2 \\ 1 & \text{if } x \gt p/2 \end{cases}$

Theorem: $$B(x)$$ is a $$(t,\epsilon)$$-hardcore bit of $$f(x) = g^x \bmod p$$ assuming there is no $$t n^3/\epsilon$$-time discrete log algorithm in $$\mathbb{Z}_p^*$$ (that succeeds with constant probability).
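The construction above can be sketched in a few lines. This is a toy illustration only: the prime here is tiny, whereas the theorem requires an $$n$$-bit prime.

```python
# Toy sketch of the Blum-Micali hardcore bit: B(x) = msb(x) is hard to
# compute from f(x) = g^x mod p, assuming discrete log is hard.
# Tiny parameters for illustration only; real use needs an n-bit prime.
p = 11   # small prime (toy parameter)
g = 2    # a generator of Z_11^*

def f(x):
    # the one-way candidate: modular exponentiation
    return pow(g, x, p)

def B(x):
    # msb(x): 0 if x < p/2, else 1
    return 0 if x < p / 2 else 1
```

Evaluating `f` is cheap, yet predicting `B(x)` from `f(x)` costs a discrete log (up to the $$t n^3/\epsilon$$ reduction loss).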

• Hardcore bit of RSA [ACGS '84]: Let $$N=pq$$ be an $$n$$-bit RSA modulus and $$e \ge 2$$. Define $$B(x) = lsb(x) = x \bmod 2$$ ($$B:\mathbb{Z}_N\rightarrow\{0,1\}$$.)

Theorem: $$B(x)$$ is a $$(t,\epsilon)$$-hardcore bit of $$f(x) = x^e \bmod N$$ assuming there is no $$t n^2/\epsilon^3$$-time $$e$$th root algorithm for $$\mathbb{Z}_N^*$$.

The Blum-Micali generator is a PRNG that is constructed from hardcore bits.
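A minimal sketch of the generator, reusing the toy discrete-log parameters above (one common presentation: iterate the permutation and output the hardcore bit of each state):

```python
# Sketch of the Blum-Micali generator: repeatedly apply the one-way
# permutation f(x) = g^x mod p and output the hardcore bit msb(x) of
# each successive state. Toy parameters for illustration only.
p, g = 11, 2

def blum_micali(seed, num_bits):
    x = seed
    out = []
    for _ in range(num_bits):
        x = pow(g, x, p)                    # next state: x <- f(x)
        out.append(0 if x < p / 2 else 1)   # emit the hardcore bit msb(x)
    return out
```

With these tiny parameters the state orbit is short, so the output cycles quickly; security requires $$n$$-bit primes.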

## Hardcore bits for general one-way permutations

It is possible to find a hardcore bit of any given one-way permutation:

• Yao’s XOR Lemma ['82]

• Goldreich-Levin ['87]

• Naslund ['96]

### Yao’s Method

Let $$f:\{0,1\}^n\rightarrow\{0,1\}^n$$ be $$(t,\epsilon)$$-one-way.

First we note that there exists $$1\le i \le n$$ such that for all $$(t/n)$$-time algorithms $$A$$,

$\Pr[A(f(x))=x|_i] \lt 1 - (1-\epsilon)/n$

(If not, then for every $$i$$ there exists a $$(t/n)$$-time algorithm $$A_i$$ with $$\Pr[A_i(f(x))=x|_i]\ge1-(1-\epsilon)/n$$. Running all the $$A_i$$ together inverts $$f$$: by the union bound, $$\Pr[(\text{for all }i)\ A_i(f(x))=x|_i] \ge 1 - n\cdot(1-\epsilon)/n = \epsilon$$, contradicting $$(t,\epsilon)$$-one-wayness.)

### Yao’s XOR Lemma

Theorem: Suppose $$B:\{0,1\}^n\rightarrow\{0,1\}$$ is a predicate such that for all $$t$$-time algorithms $$A$$, $$\Pr[A(f(x))=B(x)]\lt 1-\epsilon$$. Define $$B_m(x_1,...,x_m)=B(x_1)\oplus ... \oplus B(x_m)$$. Then for $$m=\lfloor 1/\epsilon\rfloor$$ and all $$(t/m)$$-time algorithms $$A_m$$, we have

$\Pr[A_m(f(x_1),...,f(x_m)) = B_m(x_1,...,x_m)]\lt 1/2 +\epsilon$

(where $$x_1,...,x_m\leftarrow\{0,1\}^n$$).

This implies that for $$m=\lfloor 1/\epsilon\rfloor$$, $$B_m$$ is a $$(t/m,\epsilon)$$-hardcore bit of $$f(x_1)||...||f(x_m)$$.
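The amplified predicate $$B_m$$ is trivial to compute in the forward direction; the lemma says only that *predicting* it from $$f(x_1)||...||f(x_m)$$ is hard. A sketch, using $$B = lsb$$ purely as a stand-in for a weakly hard predicate:

```python
# Sketch of Yao's XOR amplification: B_m is the XOR of m independent
# copies of a weakly hard predicate B. Here B = lsb is a stand-in;
# any predicate that is mildly hard for the given f would do.
def B(x):
    return x & 1  # least significant bit

def B_m(xs):
    # XOR of B(x_1), ..., B(x_m)
    result = 0
    for x in xs:
        result ^= B(x)
    return result
```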

### Goldreich-Levin Method

Given $$x,y \in \{0,1\}^n$$, define $$\langle x,y \rangle = \sum_{i=1}^n x_i y_i \bmod 2$$.

Let $$f:\{0,1\}^n\rightarrow\{0,1\}^n$$ be a $$(t,\epsilon/2)$$-one-way permutation. Define $$g:\{0,1\}^{2n}\rightarrow\{0,1\}^{2n}$$ by $$g(x,r)=f(x)||r$$. Define $$B:\{0,1\}^{2n}\rightarrow\{0,1\}$$ by $$B(x,r)=\langle x,r\rangle$$.

Theorem: Let $$f$$ be a $$(t,\epsilon/2)$$-one-way permutation. Then $$B$$ is $$(t',\epsilon)$$-hardcore for $$g$$ where $$t'=t/TIME(GL)$$, where GL is the Goldreich-Levin algorithm.

Proof: Suppose $$B$$ is not $$(t',\epsilon)$$ hardcore for $$g$$. Then there exists a $$t'$$-time algorithm $$A$$ such that

$\Pr[A(f(x),r) = \langle x,r\rangle |x,r\leftarrow\{0,1\}^n]\gt 1/2 +\epsilon$

Define the set $$S\subseteq\{0,1\}^n$$ by

$S = \{x\in\{0,1\}^n|\Pr[A(f(x),r) = \langle x,r\rangle|r\leftarrow\{0,1\}^n] \gt 1/2 + \epsilon/2 \}$

In other words, $$S$$ is the set of $$x$$'s on which the algorithm succeeds with good probability.

By a counting argument, $$|S| \gt (\epsilon/2)2^n$$. We shall construct an algorithm $$D$$ such that for all $$x\in S$$, $$D(f(x)) = x$$ with $$E[TIME(D)] = t' TIME(GL) = O(t'(n^2/\epsilon^2))$$.

For a fixed $$x\in S$$, define $$A_x(r) = A(f(x),r)$$. Then $$Pr_r[A_x(r) = \langle x,r\rangle] \gt 1/2 + \epsilon / 2$$.

The Goldreich-Levin Algorithm: given an oracle for $$A_x$$, the algorithm outputs all $$y\in\{0,1\}^n$$ such that $$\Pr[A_x(r) = \langle y,r\rangle] \gt 1/2 + \epsilon/2$$ with running time $$O(n^2/\epsilon^2)$$.

It turns out there are at most $$1/\epsilon^2$$ such $$y \in\{0,1\}^n$$.

Then $$D$$ works by scanning the output of the $$GL$$ algorithm to find $$f^{-1}(f(x))$$.

The Rackoff method for the GL algorithm corrects the predictor probability for $$A$$ from $$1/2 + \epsilon$$ to $$1-\epsilon$$. This corrected predictor can be used to find $$x$$.
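The final step, recovering $$x$$ from a predictor that is correct with probability $$1-\epsilon$$, can be illustrated directly. By linearity, $$\langle x,r\rangle \oplus \langle x,r\oplus e_i\rangle = x_i$$, so a majority vote over random $$r$$'s pins down each bit. The noisy predictor below is simulated; real reductions get it from the adversary $$A$$:

```python
import random

# Illustration of self-correction: given a predictor for <x, r> that is
# correct except with probability eps, recover x bit by bit via
# A(r) XOR A(r XOR e_i) = x_i, taking a majority over many random r.
random.seed(1)
n, eps, trials = 8, 0.1, 501
x_secret = 0b10110101  # the value the reduction is trying to recover

def inner(x, r):
    return bin(x & r).count("1") % 2

def A(r):
    # simulated (1 - eps)-accurate predictor for <x_secret, r>
    bit = inner(x_secret, r)
    return bit ^ 1 if random.random() < eps else bit

def recover(n):
    x = 0
    for i in range(n):
        e_i = 1 << i
        votes = sum(A(r) ^ A(r ^ e_i)
                    for r in (random.randrange(2 ** n) for _ in range(trials)))
        if votes > trials // 2:  # majority vote says x_i = 1
            x |= e_i
    return x
```

Each sample is wrong with probability about $$2\epsilon(1-\epsilon) \lt 1/2$$, so with enough samples the majority is correct except with negligible probability.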

• Every one-way permutation has an efficient hardcore bit.

• The proof is based on a noisy predictor for the inner product.

• The GL algorithm also has applications in learning theory.

### Naslund Method

Let $$f:\mathbb{F}_p\rightarrow\{0,1\}^n$$ be a one-to-one function, where $$p$$ is an $$n$$-bit prime. Define $$g(x,a,b)=(f(x),a,b)$$ where $$a,b\in\mathbb{F}_p$$. Then for $$1\le i\le n-\Omega(\log n)$$, define $$B_i(x,a,b) = (ax + b \bmod p)|_i$$. Note we ignore the $$\Omega(\log n)$$ most significant bits because they are biased.
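The predicate itself is simple bit extraction. A sketch, assuming the convention that $$i=1$$ indexes the least significant bit (the biased top bits are simply never used):

```python
# Sketch of the Naslund predicate B_i(x, a, b) = bit i of (a*x + b mod p).
# Indexing convention assumed here: i = 1 is the least significant bit;
# the top Omega(log n) bits are skipped because they are biased mod p.
p = 101  # toy prime; the construction calls for an n-bit prime

def B_i(i, x, a, b):
    return ((a * x + b) % p >> (i - 1)) & 1
```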

Theorem: If $$f$$ is $$(t,\epsilon/2)$$-one-way, then for all $$1\le i \le n-\Omega(\log n)$$, $$B_i$$ is $$(t',\epsilon')$$-hardcore for $$g$$ where $$t' = t/TIME(Naslund) = t/(n^2/\epsilon^2)$$ and $$\epsilon' = \epsilon + (c/p)$$ for some constant $$c$$.

Proof: Suppose there exists $$i, A_i$$ such that

$\Pr[A_i(f(x),a,b) = (ax+b \bmod p)|_i] \gt 1/2 + \epsilon'$

Define $$S \subseteq \mathbb{F}_p$$ by

$S = \{ x\in\mathbb{F}_p|\Pr_{a,b}[A_i(f(x),a,b) = (ax+b \bmod p)|_i] \gt 1/2 + \epsilon' /2\}$

We shall build an algorithm $$D$$ such that for all $$x\in S$$, $$D(f(x)) =x$$ and the running time of $$D$$ is $$t(n^2/\epsilon^2)$$. Naslund’s algorithm finds all such $$x$$ in time $$n^2/\epsilon^2$$.

Corollary: Let $$p,q$$ be primes, where $$q$$ is an $$n$$-bit prime dividing $$p-1$$. Let $$g\in\mathbb{Z}_p^*$$ be an element of order $$q$$, and let $$G=\langle g\rangle \le\mathbb{F}_p^*$$. Then for all $$1\le i \le n-\Omega(\log n)$$, $$x|_i$$ is $$(t,\epsilon)$$-hardcore for $$f(x)=g^x \bmod p$$ assuming no $$(t n^2/\epsilon^2)$$-time discrete logarithm algorithm exists for $$G$$.

Proof: Suppose there exists $$i, A_i$$ such that

$\Pr_{x\leftarrow\mathbb{Z}_q}[A_i(g^x \bmod p) = x|_i ] \gt 1/2 + \epsilon$

and $$A_i$$ runs in time $$t$$. Then we shall construct a discrete log algorithm that outputs $$x$$ when given $$y = g^x \bmod p$$. Define $$B_y(a,b) = (ax + b \bmod q)|_i$$.

Then $$B_y(a,b) = Dlog_g(y^a g^b)|_i$$ and

$Pr_{a,b}[A_i(y^a g^b) = (ax + b \bmod q)|_i] \gt 1/2 + \epsilon.$

Naslund’s algorithm will find all $$x$$'s that satisfy the above, and one of them is the discrete log of $$y$$.

### Subset Sum Generator [IN '88]

Let $$a_1,...,a_n\in\mathbb{F}_p$$ where $$p$$ is an $$m$$-bit prime, and let $$s\in\{0,1\}^n$$. Set $$b=\sum_{i=1}^n s_i a_i$$. The subset sum problem is to find $$s$$ given $$a_1,...,a_n,b$$. The problem is believed to be hard for $$n\approx m\approx 1000$$ on random instances.
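Sampling a random instance with a planted secret is straightforward. A sketch with toy parameters (hardness needs $$n\approx m\approx 1000$$):

```python
import random

# Generate a random subset sum instance (a_1..a_n, b) with planted
# secret s in {0,1}^n, so that b = sum of s_i * a_i mod p.
# Toy parameters for illustration; hardness needs n and m around 1000.
def subset_sum_instance(n, p, rng):
    a = [rng.randrange(p) for _ in range(n)]
    s = [rng.randrange(2) for _ in range(n)]
    b = sum(si * ai for si, ai in zip(s, a)) % p
    return a, b, s

rng = random.Random(0)
a, b, s = subset_sum_instance(8, 257, rng)
```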

Take $$m = (1+\delta)n$$, for, say, $$\delta=1/10$$ (this $$\delta$$ is unrelated to the distinguishing advantage $$\epsilon$$ below).

Define a generator as follows

$G(a_1,...,a_n,s)=(a_1,...,a_n,\sum_{i=1}^n s_i a_i).$

On every iteration, $$G$$ stretches $$mn + n$$ input bits to $$mn + m$$ output bits, a net gain of $$m - n$$ bits since $$m \gt n$$.

Theorem [IN]: $$G$$ is a $$(t,\epsilon)$$-PRNG assuming no $$t n^2/\epsilon^2$$-time subset sum algorithm exists.

Proof: Suppose there exists $$A$$ that $$(t,\epsilon)$$-distinguishes between

$\{(a_1,...,a_n,R)|a_1,...,a_n,R\leftarrow\mathbb{F}_p\}$

and

$\{(a_1,...,a_n,b)|a_1,...,a_n\leftarrow\mathbb{F}_p,s\leftarrow\{0,1\}^n, b = \sum_{i=1}^n s_i a_i \}$

Then we shall construct an algorithm that takes as input $$a_1,...,a_n,b \in\mathbb{F}_p$$ and whose goal is to find $$s\in\{0,1\}^n$$.

To do this, we build a predictor $$D$$ such that

$\Pr[D(a_1,...,a_n,b,r)=\langle r,s\rangle] \gt 1/2 +\epsilon$

and then use the Goldreich-Levin algorithm to find $$s$$. $$D$$ is given $$a_1,...,a_n,b\in\mathbb{F}_p$$ and $$r\in\{0,1\}^n$$, and works as follows:

1. Pick random $$t \leftarrow\mathbb{F}_p$$, $$k\leftarrow\{0,...,n\}$$

2. Define $$a'_i = a_i$$ if $$r_i = 0$$, and $$a'_i = a_i + t$$ if $$r_i = 1$$

3. Run algorithm $$A$$ on input $$(a'_1,...,a'_n,b+kt)$$.

4. If $$A$$ outputs "pseudorandom," output $$parity(k)$$; otherwise output $$1 - parity(k)$$. (If $$k$$ equals the true weight $$\sum_i r_i s_i$$, then $$b + kt = \sum_i s_i a'_i$$, so the shifted tuple is a genuine subset-sum image and $$\langle r,s\rangle = parity(k)$$.)
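The four steps above can be sketched as follows, with the distinguisher passed in as an oracle. One convention is assumed: $$A$$ returns 1 when it believes its input is a subset-sum output (the sign of $$A$$'s advantage determines which branch is correct):

```python
import random

# Sketch of the predictor D built from a distinguisher A.
# Assumed convention: A(a', b') returns 1 on inputs it judges to be
# subset-sum outputs ("pseudorandom"), 0 on random ones.
def D(A, a, b, r, p, rng):
    n = len(a)
    t = rng.randrange(p)      # 1. random shift t
    k = rng.randrange(n + 1)  # 1. guess k for the weight of r AND s
    # 2. shift exactly the positions where r_i = 1
    a_prime = [(ai + t) % p if ri else ai for ai, ri in zip(a, r)]
    guess = k % 2             # parity(k)
    # 3-4. if A accepts, the guess k was likely right, so <r,s> = parity(k)
    return guess if A(a_prime, (b + k * t) % p) else 1 - guess
```

When $$k$$ hits the true weight of $$r \wedge s$$ (probability $$1/(n+1)$$), the shifted tuple is a valid subset-sum image and $$A$$'s advantage translates into a prediction advantage for $$\langle r,s\rangle$$.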

Ben Lynn blynn@cs.stanford.edu