Why does randomness have a greater effect on reductions than on algorithms?


36

It is conjectured that randomness does not extend the power of polynomial-time algorithms, that is, P = BPP is conjectured to hold. On the other hand, randomness seems to have a rather different effect on polynomial-time reductions. By the well-known result of Valiant and Vazirani, SAT reduces to USAT via a randomized polynomial-time reduction. It is not likely that this reduction can be derandomized, since that would yield NP = UP, which is thought unlikely.

I wonder what the reason for this asymmetric situation could be: derandomization seems quite plausible for probabilistic polynomial-time algorithms, but not for probabilistic polynomial-time reductions?


3
I guess the reason is that randomness helps when computation is interactive (e.g. preventing another player from cheating), and a reduction can be considered a very simple kind of interactive computation.
Kaveh

11
what evidence is there for NP not being equal to UP?
Sasho Nikolov

Another situation where randomness seems to make a difference is "value oracle algorithms". For example, while there is a randomized 1/2 approximation algorithm for unconstrained submodular maximization, the best known deterministic algorithm is only a 1/3 approximation. The 1/2 approximation is known to be optimal, and the 1/3 approximation is suspected to be optimal by at least one of the authors.
Yuval Filmus

@Yuval, could you expand your comment into an answer? I would be interested in reading a longer explanation.
Kaveh

4
Great question!
Gil Kalai

Answers:


28

First of all, I would like to comment on the specific case of the Valiant-Vazirani reduction; this, I hope, will help clarify the general situation.

The Valiant-Vazirani reduction maps, with non-negligible probability, a satisfiable formula F to a uniquely satisfiable formula F′, and an unsatisfiable F to an unsatisfiable F′. All output formulas are always obtained by further restricting F, so unsatisfiability is always preserved. The reduction can be defined either as outputting a single F′, or as outputting a list F_1, …, F_t. In the latter case, "success" in the case F ∈ SAT is defined as having at least one uniquely satisfiable F_i in the list. Call these two variants the "singleton reduction" and "list-reduction" respectively (this is not standard terminology).

The first important point to note is that the success probability of the singleton reduction is quite small, namely Θ(1/n), where n is the number of variables. The difficulties in improving this success probability are explored in the paper

"Is Valiant-Vazirani's Isolation Probability Improvable?" by Dell et al.

http://eccc.hpi-web.de/report/2011/151/#revision1
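To make the singleton reduction and its small success probability concrete, here is a minimal Python simulation of the isolation idea (my own illustration, not code from the paper): instead of a formula, it works with an explicit solution set S ⊆ {0,1}^n and imposes a random number of random parity constraints; the details (hash family, exact range of k) differ from the actual Valiant-Vazirani construction.

```python
import random

def isolate_once(solutions, n):
    """One isolation attempt: guess k in {1, ..., n}, then impose k random
    parity (XOR) constraints; return the set of surviving solutions."""
    k = random.randint(1, n)
    surviving = set(solutions)
    for _ in range(k):
        subset = [i for i in range(n) if random.random() < 0.5]
        bit = random.randint(0, 1)
        surviving = {x for x in surviving
                     if sum(x[i] for i in subset) % 2 == bit}
    return surviving

# Tiny experiment: how often does exactly one solution survive?
n = 10
solutions = list({tuple(random.randint(0, 1) for _ in range(n)) for _ in range(40)})
trials = 20000
hits = sum(len(isolate_once(solutions, n)) == 1 for _ in range(trials))
print(f"isolation frequency ~ {hits / trials:.3f} (roughly on the order of 1/n)")
```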

In the list-reduction, the success probability can be made large, 1 − 2^(−n) say, with a poly(n)-sized list. (One can simply repeat the singleton reduction many times, for example.)
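For the record, the amplification behind that claim is a standard repetition argument (my own back-of-the-envelope calculation, not taken from the answer): if one run of the singleton reduction succeeds with probability at least c/n for some constant c > 0, then

```latex
\Pr[\text{all } t \text{ independent runs fail}]
  \;\le\; \Bigl(1 - \tfrac{c}{n}\Bigr)^{t}
  \;\le\; e^{-ct/n}
  \;\le\; 2^{-n}
  \quad\text{whenever } t \;\ge\; \tfrac{n^{2}\ln 2}{c},
```

so a list of size t = O(n²), which is poly(n), already achieves success probability 1 − 2^(−n).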

Now, it is not at all evident or intuitive that we should be able to directly derandomize a reduction that only has success probability 1/n. Indeed, none of the hardness-vs-randomness results give hypotheses under which we can do so in this case. It is much more plausible that the list-reduction can be derandomized (with a somewhat larger list). Note though that this would not imply NP=UP: our output list of formulas may have many uniquely-satisfiable formulas, and perhaps some with many satisfying assignments, and it seems hopeless to try to define a uniquely-accepting computation over such a list.

Even if we could somehow give a list-reduction in which a satisfiable F always induced a list F_1, …, F_t where most of the F_j's are uniquely satisfiable, there is no clear way to turn that into a deterministic singleton reduction for isolation. The real underlying difficulty is that we don't know of any "approximate-majority operation for uniquely-satisfiable formulas", that is, a reduction R(F_1, …, F_t) whose output is uniquely satisfiable if most F_j's are uniquely satisfiable, and unsatisfiable if most F_j's are unsatisfiable. This also seems like a general phenomenon: reductions output more complex objects than decision algorithms, and the properties of these objects are harder to check, so it's harder to combine many of these objects into a single object that inherits some property of the majority.

For the Valiant-Vazirani case, it does not even seem likely under plausible derandomization assumptions that we'd be able to obtain NP=FewP, that is, to deterministically reduce satisfiable formulas to satisfiable formulas with poly(n) solutions. Intuitively this stems from the fact that the isolating procedure has no idea of even the rough size of the solution set of the formula F it is given.


1
I wish that everyone that ever learned about Valiant-Vazirani would read this answer. The misunderstanding that derandomizing VV would imply NP=UP is unfortunately and doggedly persistent, and this gives a clear discussion of the issues and alternatives involved.
Joshua Grochow

13

In the oracle world, it is easy to give examples where randomness gives us much more power. Consider, for example, the problem of finding a zero of a balanced Boolean function. A randomized algorithm accomplishes that using O(1) queries with constant success probability, while any deterministic algorithm requires at least n/2 queries.
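As a concrete illustration (a sketch under the simplifying assumption that the oracle is just an array with exactly n/2 zeros and n/2 ones; the function name and parameters are mine): sampling k random positions fails to find a zero with probability 2^(−k), independently of n, whereas an adversary answering 1 to the first n/2 queries defeats any deterministic strategy that asks fewer queries.

```python
import random

def find_zero_randomized(oracle, n, k=20):
    """Query k uniformly random positions of a balanced 0/1 string.
    Each query hits a zero with probability 1/2, so this fails only
    with probability at most 2**(-k), regardless of n."""
    for _ in range(k):
        i = random.randrange(n)
        if oracle(i) == 0:
            return i
    return None  # happens with probability <= 2**(-k)

# Toy oracle: a random balanced string of length n.
n = 1_000_000
bits = [0] * (n // 2) + [1] * (n // 2)
random.shuffle(bits)
print("zero found at index:", find_zero_randomized(lambda i: bits[i], n))
```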

Here is another situation where it is suspected that randomization helps. Suppose we want to maximize a monotone submodular function over a matroid constraint. There are two different algorithms which give a 1 − 1/e approximation, and this is optimal in this model by a result of Vondrák. Both algorithms need to compute a function of the form E_{x∼X} f(x), where X is a distribution with exponential support. Computing this function exactly is too costly, but it can be approximated by sampling, and the result is a randomized algorithm. In contrast, the best known deterministic algorithm, the greedy algorithm, gives a 1/2 approximation.
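The sampling step can be sketched in a few lines (a toy illustration of the kind of estimator such algorithms use; f, sample_x, and the modular toy function are my placeholders, not the actual continuous-greedy code):

```python
import random

def estimate_expectation(f, sample_x, num_samples=1000):
    """Estimate E_{x ~ X} f(x) by averaging f over i.i.d. samples from X.
    The distribution X has exponentially large support, so the expectation
    is approximated by sampling rather than computed exactly -- this is how
    randomness enters the algorithms described above."""
    return sum(f(sample_x()) for _ in range(num_samples)) / num_samples

# Toy example: f is a modular (hence submodular) weight function on subsets
# of {0,...,9}, and X includes each element independently with probability p.
ground_set = range(10)
weights = {i: (i % 3) + 1 for i in ground_set}
f = lambda s: sum(weights[i] for i in set(s))
p = 0.3
sample_x = lambda: [i for i in ground_set if random.random() < p]

print("sampled estimate:", estimate_expectation(f, sample_x))
print("exact value:     ", p * sum(weights.values()))  # known here because f is modular
```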

A similar situation occurs in unconstrained submodular maximization (here the function is not necessarily monotone). The recent breakthrough algorithm gives an optimal 1/2 approximation, but its deterministic version gives only a 1/3 approximation. Here the randomization manifests itself either in exactly the same way as in the monotone case, or (in a different version of the algorithm) by making a few random choices along the way.
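For intuition, here is a sketch in the spirit of the randomized double-greedy approach (my paraphrase of how "a few random choices along the way" can look, not the authors' actual algorithm or analysis):

```python
import random

def randomized_double_greedy(f, ground_set):
    """Double-greedy-style sketch for unconstrained submodular maximization:
    maintain a lower set X and an upper set Y, and for each element make a
    random choice biased by the two marginal gains. The coin flips below are
    exactly where the randomization enters."""
    X, Y = set(), set(ground_set)
    for e in ground_set:
        a = f(X | {e}) - f(X)      # gain of adding e to X
        b = f(Y - {e}) - f(Y)      # gain of removing e from Y
        a, b = max(a, 0.0), max(b, 0.0)
        if a + b == 0 or random.random() < a / (a + b):
            X.add(e)
        else:
            Y.discard(e)
    return X  # X == Y at this point

# Toy non-monotone submodular function: the cut function of a small graph.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
cut = lambda S: sum(1 for u, v in edges if (u in S) != (v in S))
S = randomized_double_greedy(cut, range(4))
print("chosen set:", S, "cut value:", cut(S))
```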

One of the authors of the latter paper conjectures that 1/3 is the best that a deterministic algorithm can achieve, and we can similarly conjecture that 1/2 is the best that can be achieved in the previous problem. If these conjectures are true, then this is a very natural situation in which randomization provably helps.

Recently, Dobzinski and Vondrák showed how to transform value oracle lower bounds (for randomized algorithms) into hardness results, conditional on NP different from RP (the key ingredient is list decoding). We should mention that the transformation relies on the specific method used to prove the oracle lower bounds. Perhaps it is true that deterministic value oracle lower bounds also translate into hardness results.


I wonder if the volume estimation problem falls under this "value oracle" model. In that model, you're given a membership oracle for the convex object whose volume you're estimating, and it's well known that this cannot be approximated deterministically to even an exponential factor, but can be approximated arbitrarily well by a randomized algorithm.
Suresh Venkat

12

One reason why it might seem strange to you, that we seem to think there is more apparent (or conjectured) power in the randomized reductions from NP to UP than the comparable one from BPP to P, is because you may be tempted to think of randomness as something which is either powerful (or not powerful) independently of what "machine" you add it to (if we caricature these complexity classes as classes arising from machine models).

And yet, these reductions of different power exist. In fact, a computational resource such as randomness does not necessarily have a fixed amount of computational power, which is either "significant" or "not significant".

We may consider any complexity class which is low for itself — for instance, L, P, BPP, BQP, ⊕P, or PSPACE — to be amenable to a machine model of sorts in which the machine always has a well-defined state about which you can ask questions at any point in time, while also allowing for the computation to continue beyond the question that you ask: in essence, exactly that the machine can simulate one algorithm as a subroutine for another. The machine which performs the computation may not be particularly realistic if we restrict ourselves to practical constraints on resources (e.g. physically realisable and able to produce answers in low-degree polynomial time for problems of interest), but unlike classes such as NP — for which we have no idea how a nondeterministic machine could produce the answer to another problem in NP and use the answer in any way aside from (iterated) conjunctive and disjunctive truth-table reductions — imagining such a class as being embodied by a machine with a well-defined state which we can enquire into does not lead us badly astray.

If we take this position, we can ask what happens if we provide these computational models M with extra facilities such as randomness or nondeterminism. (These extra facilities don't necessarily preserve the property of being interpretable by a machine model, especially in the case of nondeterminism, but they do give rise to 'new' classes.) If this extra facility gives the model more power, giving rise to a class C, this is in effect equivalent to saying that there is a reduction from C to M using that facility, e.g. a randomized reduction in the case of randomness.

The reason why I'm describing this in terms of classes which are low for themselves is that if we take seriously that they are "possible models of computation in another world", your question about randomized reductions corresponds to the fact that it seems that randomness dramatically increases the power of some models but not others.

In place of randomized reductions from NP to UP, we can observe that there is a randomized reduction from all of PH to the class ⊕P — or equivalently, PH is contained in BP·⊕P, the class obtained if you add bounded-error randomness to ⊕P — by Toda's Theorem. And your question can then be posed as: why does this happen? Why should some machines gain so much from randomness, and others so little? In the case of PH ⊆ BP·⊕P, it seems as though the modulo-2 nondeterminism entailed in the definition of ⊕P (essentially a counting quantifier modulo 2) catalyses the randomness entailed in bounded error (essentially a counting quantifier with a promise gap) to give us the equivalent of an entire unbounded hierarchy of existential and universal quantifiers. But this does not mean that we suppose that ⊕P is itself approximately as powerful as the entire polynomial hierarchy, does it? Neither the resources of bounded-error randomness nor modulo-2 counting are thought to be nearly that powerful. What we observe is that together, these two quantifiers are that powerful.
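For reference, the chain of containments behind that remark (the standard statement of Toda's theorem, included here only for convenience) is:

```latex
\mathrm{PH} \;\subseteq\; \mathrm{BP}\cdot\oplus\mathrm{P} \;\subseteq\; \mathrm{P}^{\#\mathrm{P}},
```

where the first inclusion is the randomized reduction of PH to ⊕P referred to above, and the second follows by evaluating the randomized ⊕P computation exactly with a #P oracle.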

There's also a question of whether we can really say that randomness is weak in absolute terms, compared say to nondeterminism: if randomness is so weak, and if we're so convinced that BPP = P, why can we only bound BPP ⊆ Σ₂ᵖ ∩ Π₂ᵖ in the polynomial hierarchy, using two levels of nondeterminism rather than just one? But this may just be a result of the fact that, while we suspect that randomness added to simple polynomial-time computation doesn't give much power, we have no idea of how to simulate that additional power using only a small amount of nondeterminism of the sort involved in NP and coNP. (Of course, it's difficult to prove anything nontrivial in complexity theory; but that again is just the statement that these different sorts of resources are difficult to compare on a scale!)

There is no strong argument that I can give to defend why this should be the case, other than to observe that so far it simply is the case; and that if you think that PH doesn't collapse and is different from P, and that BPP = P, then you should consider the possibility that facilities such as randomness and nondeterminism can have powers which are not easily comparable to one another, and which can synergize or catalyse one another to give computational power that neither one would plausibly have on its own. The hypothesis that BPP = P is not that "randomness has no power", but that randomness alone (or rather, supplemented only by polynomial time computation and provided to an otherwise deterministic computational model) is not powerful. But this does not mean that there can be no power in randomness, which may be catalysed by other computational resources.


"aside from a disjunctive truth-table reduction —" what about other monotone truth-table reductions, such as a conjunctive truth-table reduction?

@RickyDemer: Quite right. At the time I wrote this, I was working on certain nondeterministic classes related to NL, for which closure under d.t.t- and c.t.t.-reductions both would have implied closure under complements, and so I omitted mention of c.t.t.; but the same is clearly not true for NL or NP themselves. I'll edit my answer.
Niel de Beaudrap

@NieldeBeaudrap This is a very good answer as well.
Tayfun Pay
Licensed under cc by-sa 3.0 with attribution required.