# A Solution to problem 20 in Scott Aaronson’s “Projects aplenty”

Added 2016-06-04: Although I wrote this at the bottom, to be extra clear I’m writing this at the top: After further investigation, the method I described was found not to work.

Scott Aaronson has a post in his blog, Projects aplenty, where he proposes many research problems for students. One of them is:

20. Do there exist probability distributions $D_1$, $D_2$ over n-bit strings such that $(D_1^2+D_2^2)/2$ (an equal mixture of two independent samples from $D_1$ and two independent samples from $D_2$) is efficiently samplable, even though $D_1$ and $D_2$ themselves are not efficiently samplable?  (This is closely-related to a beautiful question posed by Daniel Roy, of whether the de Finetti Theorem has a “polynomial-time analogue.”)

In the comments, I proposed one solution to the problem. Scott Aaronson sent me an email pointing out that the conditions in which the solution work are so strong that the solution is effectively trivial. I have thought about the problem more and I believe I have found a much better solution.

To be specific, what I have is: Given data $A$ there are distributions $D_0(A)$ and $D_1(A)$ such that it is possible to efficiently approximately sample $D_0(A)^2 + D_1(A)^2$ given $A$. However, neither $D_0(A)$ nor $D_1(A)$ are efficiently approximately sampleable, even given $O (\log |A|)$ bits of advice. Moreover, it is easy to generate appropiate $A$, in the sense that there is a polynomial-time probablistic algorithm which given $1^\lambda$ creates an $A$ with $|A| = \mathrm{poly} (\lambda)$ and $A$ satisfies these properties.

Before I present the solution, some definitions: If $p$ and $P$ are prime with $p | P-1$ and $n \in \mathbb{Z} / P \mathbb{Z}$, then I define $\left( \frac {n} {P} \right)_p = n ^{P/p} \mod P$. When $N$ is an odd number, not necessarily prime, and $n \in \mathbb{Z} / N \mathbb{Z}$, then $\left( \frac {n} {N} \right)$ is the Jacobi symbol of $n$ mod $N$.

Now, to generate $A$: first, pick primes $p$, $q_0$, and $q_1$ with approximately $\lambda$ bits. Let $P$ and $Q$ be primes with $p | P-1$ and $q_0 q_1 | Q-1$. Set $N = P Q$. Set $n$ to be a natural number with $\log(N) \ll n$, and $m = \lfloor \sqrt{P} \rfloor$. Set $g \in \mathbb{Z}/ P\mathbb{Z}$ to be a random $p$th root of unity. Set $(a_{ij})$ to be a random $n \times n$ of elements of $\mathbb{Z} / n\mathbb{Z}$ subject to these properties:

• For all $0 \leq i,j < n$, $\left( \frac {a_{ij}} {N} \right) = 1$ and $a_{ij} a_{ji}$ is a square.
• For any $0 \leq i_0,i_1,j < n$, we have $\left( \frac {a_{i_0 j}} {Q} \right) _{q_0} = \left( \frac {a_{i_1 j}} {Q}) \right) _{q_0}$.
• Similarly, for any $0 \leq i,j_0,j_1 < n$, we have $\left( \frac {a_{i j_0}} {Q} \right) _{q_1} = \left( \frac {a_{i j_1}} {Q}) \right) _{q_1}$.
• For any $0 \leq i,j < n$, $\left( \frac {a_{ij}} {P} \right)_p = g^i$, where $i$ is chosen randomly from the binomial distribution $B(m, 1/2)$.

Then $A = (N, n, (a_{ij}))$.

Given $A$ and a permutation $\pi \in S_n$, define the value $f(\pi) = \prod _{0 \leq i < n} a_{i \pi(i)}$. Then for random $\pi$, $f(\pi)$ samples from a distribution $D$, which constists of random $x \in \mathbb{Z} / n\mathbb{Z}$ with $(x/Q)_{q_0} = \prod_{i and $(x/Q)_{q_1} = \prod_{j and such that $(x/P)_p = g^i$ where $i$ is sampled from $B(n m, 1/2)$.

It is possible to split the distribution $D = (D_0 + D_1) /2$, where $D_0$ consists of sampling $D$ and post-selecting for squares, and $D_1$ consists of sampling $D$ and post-selecting for nonsquares. Then it is possible to sample $(D_0^2 + D_1^2)/2$ as follows: Pick a random permutation $\pi \in S_n$ and take $(f(\pi), f(\pi^{-1}))$. However, as far as I can tell, it is impossible to sample either $D_0$ or $D_1$ individually.

Note that $m$ must be large. Otherwise, there is a nonnegligable probability that $\left( \frac {a_{i_0 j_0} a_{i_1 j_1}} {P} \right)_p = \left( \frac {a_{i_0 j_1} a_{i_1 j_0}} {P} \right)_p$, in which case both $D_0$ and $D_1$ can be sampled by taking $f(\pi) (a_{i_0 j_0} a_{i_1 j_1} / a_{i_0 j_1} a_{i_1 j_0}) ^{2 k}$ for appropiate $\pi$ and large random $k$.

One concern with this approach is that $(D_0^2 + D_1^2)/2$ is not being sampled exactly. However, I believe what is being sampled has a neglible statistical distance from $(D_0^2 + D_1^2)$, at least for large enough $n$. However, I haven’t proven this.

Added 2013-10-25: I have now written an implementation of this distribution. Find it here.

Added 2014-01-01: Two problems were found in this method. First of all, this method
relies on the fact that $f (\pi)$ and $f (\pi ^{-1})$ have the same
square-status. However, more is true. If $\pi = c_0 \dots c_{r-1}$ is the
decomposition of $\pi$ into cycles and given any $\pi' = c'_0 \dots c'_{r-1}$ with each $c'_i$ equal to either $c_i$ or $c_i ^{-1}$, then still $f (\pi)$ and $f (\pi ^{-1})$ have the same
square-status. Then picking a $\pi$ with many cycles and sampling randomly
from $f (\pi')$ is a way to sample from $D_i$ where $i = 0$ if
$f (\pi)$ is a square and $i = 1$ otherwise.

This alternate process has around $C^n$ possible outcomes while the
process for sampling $(D_0^2 + D_1^2)/2$ has $n!$ possible outcomes.
This suggests a possible resolution of this problem, namely, fine-tuning $n$ so that the latter has enough entropy to sample randomly, while the former
does not. To investigate weather this works, I analyzed more carefully how
random $f (\pi)$ is using a Fourier analysis. I discovered a second
problem: My method does not perfectly sample $(D_0^2 + D_1^2)/2$, but
creates a small correlation between the elements of this pair which decrease
polynomially in $n$ rather than exponentially in $n$.