One reason it might seem strange that we attribute more apparent (or conjectured) power to the randomized reduction from $\mathsf{NP}$ to $\mathsf{UP}$ than to the comparable one from $\mathsf{BPP}$ to $\mathsf{P}$ is that you may be tempted to think of randomness as something which is either powerful or not powerful, independently of what "machine" you add it to (if we caricature these complexity classes as classes arising from machine models).

And yet, these reductions of different power exist. In fact, a computational resource such as randomness does not necessarily have a fixed amount of computational power, which is either "significant" or "not significant".

We may consider any complexity class which is low for itself (for instance $\mathsf{L}$, $\mathsf{P}$, $\mathsf{BPP}$, $\mathsf{BQP}$, $\oplus\mathsf{P}$, or $\mathsf{PSPACE}$) to be amenable to a machine model of sorts, in which the machine always has a well-defined state about which you can ask questions at any point in time, while also allowing the computation to continue beyond the question that you ask: in essence, the machine can simulate one algorithm as a subroutine of another. The machine which performs the computation may not be particularly *realistic* if we restrict ourselves to practical constraints on resources (*e.g.* physically realisable and able to produce answers in low-degree polynomial time for problems of interest). Still, imagining such a class as being embodied by a machine with a well-defined state which we can enquire into does not lead us badly astray. Contrast this with classes such as $\mathsf{NP}$, for which we have no idea how a nondeterministic machine could produce the answer to another problem in $\mathsf{NP}$ and use that answer in any way aside from (iterated) conjunctive and disjunctive truth-table reductions.

If we take this position, we can ask what happens if we provide these computational models $\mathsf{M}$ with extra facilities such as randomness or nondeterminism. (These extra facilities don't necessarily preserve the property of being interpretable by a machine model, especially in the case of nondeterminism, but they do give rise to 'new' classes.) If this extra facility gives the model more power, giving rise to a class $\mathsf{C}$, this is in effect equivalent to saying that there is a reduction from $\mathsf{C}$ to $\mathsf{M}$ using that facility, *e.g.* a randomized reduction in the case of randomness.
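As a concrete toy instance of such a randomized reduction, the reduction from $\mathsf{NP}$ to $\mathsf{UP}$ mentioned at the outset is usually obtained by Valiant-Vazirani isolation: intersect the solution set with the kernel of a random affine map over GF(2) and hope that exactly one solution survives. The sketch below is only an illustration under stated assumptions: it works on an explicitly listed solution set (a real reduction would add the hash constraints to the formula instead), and the names `random_affine_hash`, `hashes_to_zero`, `isolate`, and `isolation_rate` are hypothetical, chosen for this example.

```python
import random

def random_affine_hash(n, k, rng):
    """Draw h(x) = A x + b over GF(2), stored as k (row bitmask, offset bit) pairs."""
    return [(rng.getrandbits(n), rng.getrandbits(1)) for _ in range(k)]

def hashes_to_zero(h, x):
    """True if every output bit of h(x) is 0; x is an n-bit integer."""
    # Each output bit is parity(row AND x) XOR offset.
    return all((bin(row & x).count("1") + b) % 2 == 0 for row, b in h)

def isolate(solutions, n, rng):
    """One isolation attempt: keep the solutions that a random hash sends to 0."""
    k = rng.randint(2, n + 1)          # guess roughly log2(number of solutions)
    h = random_affine_hash(n, k, rng)
    return {x for x in solutions if hashes_to_zero(h, x)}

def isolation_rate(solutions, n, trials=2000, seed=0):
    """Empirical fraction of attempts that leave exactly one solution."""
    rng = random.Random(seed)
    hits = sum(len(isolate(solutions, n, rng)) == 1 for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    # Toy "witness set": 5 satisfying assignments of some 4-variable formula.
    witnesses = {0b0011, 0b0101, 0b0110, 0b1001, 0b1110}
    print(isolation_rate(witnesses, 4))
```

Each attempt succeeds with probability at least inverse-polynomial in `n`, so the reduction has one-sided usefulness only after repetition; the point here is just that randomness is what turns "at least one witness" into "exactly one witness".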

The reason why I'm describing this in terms of classes which are low for themselves is that if we take seriously that they are "possible models of computation in another world", your question about randomized reductions corresponds to the fact that it seems that *randomness dramatically increases the power of some models but not others*.

In place of randomized reductions from $\mathsf{NP}$ to $\mathsf{UP}$, we can observe that there is a randomized reduction from all of $\mathsf{PH}$ to $\oplus\mathsf{P}$: by Toda's Theorem, $\mathsf{PH}$ is contained in $\mathsf{BP}\cdot\oplus\mathsf{P}$, the class obtained if you add bounded-error randomness to $\oplus\mathsf{P}$. And your question can then be posed as: *why does this happen*? Why should some machines gain so much from randomness, and others so little? In the case of $\mathsf{PH}\subseteq\mathsf{BP}\cdot\oplus\mathsf{P}$, it seems as though the modulo-2 nondeterminism entailed in the definition of $\oplus\mathsf{P}$ (essentially a counting quantifier modulo 2) catalyses the randomness entailed in bounded error (essentially a counting quantifier with a promise gap) to give us the equivalent of an entire unbounded hierarchy of existential and universal quantifiers. But this does not mean that we suppose that $\oplus\mathsf{P}$ is itself approximately as powerful as the entire polynomial hierarchy, does it? Neither the resources of bounded-error randomness nor modulo-2 counting are thought to be nearly that powerful. What we observe is that *together*, these two quantifiers *are* that powerful.
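To make the "two counting quantifiers" reading explicit, the standard definitions can be written side by side. The symbols here are notation introduced for this illustration rather than taken from the discussion above: $R$ is a polynomial-time predicate, $p$ and $q$ are polynomials, $A$ is a language, and $\mathsf{C}$ is an arbitrary class.

$$
\begin{aligned}
L \in \oplus\mathsf{P} &\iff \exists R\ \ \forall x:\quad x \in L \Leftrightarrow \#\bigl\{\, y \in \{0,1\}^{p(|x|)} : R(x,y) \,\bigr\} \equiv 1 \pmod 2, \\
L \in \mathsf{BP}\cdot\mathsf{C} &\iff \exists A \in \mathsf{C}\ \ \forall x:\quad \Pr_{r \in \{0,1\}^{q(|x|)}}\bigl[\, (x,r) \in A \Leftrightarrow x \in L \,\bigr] \ge \tfrac{2}{3}, \\
\mathsf{PH} &\subseteq \mathsf{BP}\cdot\oplus\mathsf{P}.
\end{aligned}
$$

The first line counts witnesses modulo 2, the second counts random strings subject to a gap, and the third line is the statement that stacking the second on top of the first already covers every level of the hierarchy.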

There's also a question of whether we can really say that randomness is weak in absolute terms, compared say to nondeterminism: if randomness is so weak, and if we're so convinced that $\mathsf{BPP}=\mathsf{P}$, why can we only bound $\mathsf{BPP}\subseteq\Sigma_2^{\mathsf{p}}\cap\Pi_2^{\mathsf{p}}$ in the polynomial hierarchy, using *two* levels of indeterminism rather than just one? But this may simply reflect the fact that, while we suspect that randomness added to simple polynomial-time computation doesn't give much power, we have no idea how to simulate that additional power using only a small amount of nondeterminism of the sort involved in $\mathsf{NP}$ and $\mathsf{coNP}$. (Of course, it's difficult to prove *anything* nontrivial in complexity theory; but that *again* is just the statement that these different sorts of resources are difficult to compare on a scale!)

There is no strong argument that I can give to defend why this should be the case, other than to observe that so far it simply *is* the case; and that if you think that $\mathsf{PH}$ doesn't collapse, is different from $\oplus\mathsf{P}$, and that $\mathsf{BPP}\approx\mathsf{P}$, then you should consider the possibility that facilities such as randomness and nondeterminism can have powers which are not easily comparable to one another, and which can *synergize* or catalyse one another to give computational power that neither one would plausibly have on its own. The hypothesis that $\mathsf{BPP}=\mathsf{P}$ is not that "randomness has no power", but that randomness *alone* (or rather, supplemented only by polynomial time computation and provided to an otherwise deterministic computational model) is not powerful. But this does not mean that there can be no power in randomness, which may be catalysed by other computational resources.