A Solution to problem 20 in Scott Aaronson’s “Projects aplenty”

Added 2016-06-04: Although I wrote this at the bottom, to be extra clear I’m writing this at the top: After further investigation, the method I described was found not to work.

Scott Aaronson has a post in his blog, Projects aplenty, where he proposes many research problems for students. One of them is:

20. Do there exist probability distributions D_1, D_2 over n-bit strings such that (D_1^2+D_2^2)/2 (an equal mixture of two independent samples from D_1 and two independent samples from D_2) is efficiently samplable, even though D_1 and D_2 themselves are not efficiently samplable?  (This is closely-related to a beautiful question posed by Daniel Roy, of whether the de Finetti Theorem has a “polynomial-time analogue.”)

In the comments, I proposed one solution to the problem. Scott Aaronson sent me an email pointing out that the conditions in which the solution work are so strong that the solution is effectively trivial. I have thought about the problem more and I believe I have found a much better solution.

To be specific, what I have is: Given data A there are distributions D_0(A) and D_1(A) such that it is possible to efficiently approximately sample D_0(A)^2 + D_1(A)^2 given A. However, neither D_0(A) nor D_1(A) are efficiently approximately sampleable, even given O (\log |A|) bits of advice. Moreover, it is easy to generate appropiate A, in the sense that there is a polynomial-time probablistic algorithm which given 1^\lambda creates an A with |A| = \mathrm{poly} (\lambda) and A satisfies these properties.

Before I present the solution, some definitions: If p and P are prime with p | P-1 and n \in \mathbb{Z} / P \mathbb{Z}, then I define \left( \frac {n} {P} \right)_p = n ^{P/p} \mod P. When N is an odd number, not necessarily prime, and n \in \mathbb{Z} / N \mathbb{Z}, then \left( \frac {n} {N} \right) is the Jacobi symbol of n mod N.

Now, to generate A: first, pick primes p, q_0, and q_1 with approximately \lambda bits. Let P and Q be primes with p | P-1 and q_0 q_1 | Q-1. Set N = P Q. Set n to be a natural number with \log(N) \ll n, and m = \lfloor \sqrt{P} \rfloor. Set g \in \mathbb{Z}/ P\mathbb{Z} to be a random pth root of unity. Set (a_{ij}) to be a random n \times n of elements of \mathbb{Z} / n\mathbb{Z} subject to these properties:

  • For all 0 \leq i,j < n, \left( \frac {a_{ij}} {N} \right) = 1 and a_{ij} a_{ji} is a square.
  • For any 0 \leq i_0,i_1,j < n, we have \left( \frac {a_{i_0 j}} {Q} \right) _{q_0} = \left( \frac {a_{i_1 j}} {Q}) \right) _{q_0}.
  • Similarly, for any 0 \leq i,j_0,j_1 < n, we have \left( \frac {a_{i j_0}} {Q} \right) _{q_1} = \left( \frac {a_{i j_1}} {Q}) \right) _{q_1}.
  • For any 0 \leq i,j < n, \left( \frac {a_{ij}} {P} \right)_p = g^i, where i is chosen randomly from the binomial distribution B(m, 1/2).

Then A = (N, n, (a_{ij})).

Given A and a permutation \pi \in S_n, define the value f(\pi) = \prod _{0 \leq i < n} a_{i \pi(i)}. Then for random \pi, f(\pi) samples from a distribution D, which constists of random x \in \mathbb{Z} / n\mathbb{Z} with (x/Q)_{q_0} = \prod_{i<n} (a_{0i} / Q)_{q_0} and (x/Q)_{q_1} = \prod_{j<n} (a_{j0} / Q)_{q_1} and such that (x/P)_p = g^i where i is sampled from B(n m, 1/2).

It is possible to split the distribution D = (D_0 + D_1) /2, where D_0 consists of sampling D and post-selecting for squares, and D_1 consists of sampling D and post-selecting for nonsquares. Then it is possible to sample (D_0^2 + D_1^2)/2 as follows: Pick a random permutation \pi \in S_n and take (f(\pi), f(\pi^{-1})). However, as far as I can tell, it is impossible to sample either D_0 or D_1 individually.

Note that m must be large. Otherwise, there is a nonnegligable probability that \left( \frac {a_{i_0 j_0} a_{i_1 j_1}} {P} \right)_p = \left( \frac {a_{i_0 j_1} a_{i_1 j_0}} {P} \right)_p, in which case both D_0 and D_1 can be sampled by taking f(\pi) (a_{i_0 j_0} a_{i_1 j_1} / a_{i_0 j_1} a_{i_1 j_0}) ^{2 k} for appropiate \pi and large random k.

One concern with this approach is that (D_0^2 + D_1^2)/2 is not being sampled exactly. However, I believe what is being sampled has a neglible statistical distance from (D_0^2 + D_1^2), at least for large enough n. However, I haven’t proven this.

Added 2013-10-25: I have now written an implementation of this distribution. Find it here.

Added 2014-01-01: Two problems were found in this method. First of all, this method
relies on the fact that f (\pi) and f (\pi ^{-1}) have the same
square-status. However, more is true. If \pi = c_0 \dots c_{r-1} is the
decomposition of \pi into cycles and given any \pi' = c'_0 \dots c'_{r-1} with each c'_i equal to either c_i or c_i ^{-1}, then still f (\pi) and f (\pi ^{-1}) have the same
square-status. Then picking a \pi with many cycles and sampling randomly
from f (\pi') is a way to sample from D_i where i = 0 if
f (\pi) is a square and i = 1 otherwise.

This alternate process has around C^n possible outcomes while the
process for sampling (D_0^2 + D_1^2)/2 has n! possible outcomes.
This suggests a possible resolution of this problem, namely, fine-tuning n so that the latter has enough entropy to sample randomly, while the former
does not. To investigate weather this works, I analyzed more carefully how
random f (\pi) is using a Fourier analysis. I discovered a second
problem: My method does not perfectly sample (D_0^2 + D_1^2)/2, but
creates a small correlation between the elements of this pair which decrease
polynomially in n rather than exponentially in n.

Advertisements