# Robustness in splitting a junta

16

We say that a Boolean function $f: \{0,1\}^n \to \{0,1\}$ is a $k$-junta if $f$ has at most $k$ influencing variables.

Let $f: \{0,1\}^n \to \{0,1\}$ be a $2k$-junta. Denote the variables of $f$ by $x_1, x_2, \ldots, x_n$. Fix

$$S_1 = \{x_1, x_2, \ldots, x_{n/2}\}, \quad S_2 = \{x_{n/2+1}, x_{n/2+2}, \ldots, x_n\}.$$

Clearly, there exists $S \in \{S_1, S_2\}$ such that $S$ contains at most $k$ of the influencing variables of $f$.

Now let $\epsilon > 0$ and suppose that $f: \{0,1\}^n \to \{0,1\}$ is $\epsilon$-far from every $2k$-junta (that is, one has to change at least an $\epsilon$ fraction of the values of $f$ to make it a $2k$-junta). Can we make a "robust" version of the statement above? That is, is there a universal constant $c$ and a set $S \in \{S_1, S_2\}$ such that $f$ is $\frac{\epsilon}{c}$-far from every function that contains at most $k$ influencing variables in $S$?

Note: in the original formulation of the question, $c$ was fixed as $2$. Neal's example shows that this value of $c$ does not suffice. However, since in property testing we are usually not too concerned with constants, I have relaxed the condition slightly.

Can you clarify your terms? Is a variable "influencing" unless the value of $f$ is always independent of the variable? Does "change a value of $f$" mean changing one of the values $f(x)$ for some particular $x$?
Neal Young

Sure, the variable $x_i$ is influencing if there exists an $n$-bit string $y$ such that $f(y) \neq f(y')$, where $y'$ is the string $y$ with the $i$'th coordinate flipped. Changing a value of $f$ means making a change in its truth table.
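To make the definition concrete, here is a minimal brute-force sketch in Python. The helper name `influencing_variables` and the test function `example` are hypothetical, introduced only for illustration, and the coordinates are 0-based rather than 1-based as in the text.

```python
from itertools import product

def influencing_variables(f, n):
    """Return the sorted 0-based indices i for which some n-bit input y
    satisfies f(y) != f(y with bit i flipped)."""
    influencing = set()
    for y in product((0, 1), repeat=n):
        for i in range(n):
            y_flipped = y[:i] + (1 - y[i],) + y[i + 1:]
            if f(y) != f(y_flipped):
                influencing.add(i)
    return sorted(influencing)

def example(y):
    # Hypothetical test function: depends only on bits 0 and 2.
    return y[0] ^ y[2]

print(influencing_variables(example, 4))  # -> [0, 2]
```

The search enumerates the whole truth table, so it is only practical for small $n$.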

Answers:

17

The answer is "yes". The proof is by contradiction.

For notational convenience, denote the first $n/2$ variables by $x$ and the second $n/2$ variables by $y$. Suppose that $f(x,y)$ is $\delta$-close to a function $f_1(x,y)$ that depends only on $k$ coordinates of $x$. Denote its influencing coordinates by $T_1$. Similarly, suppose that $f(x,y)$ is $\delta$-close to a function $f_2(x,y)$ that depends only on $k$ coordinates of $y$. Denote its influencing coordinates by $T_2$. We need to prove that $f$ is $4\delta$-close to a $2k$-junta $\tilde f(x,y)$.

Let us say that $(x_1,y_1) \sim (x_2,y_2)$ if $x_1$ and $x_2$ agree on all coordinates in $T_1$, and $y_1$ and $y_2$ agree on all coordinates in $T_2$. We choose a representative for every equivalence class uniformly at random. Let $(\bar x, \bar y)$ be the representative of the class of $(x,y)$. Define $\tilde f$ as follows:

$$\tilde f(x,y) = f(\bar x, \bar y).$$

It is obvious that $\tilde f$ is a $2k$-junta (it depends only on the variables in $T_1 \cup T_2$). We shall prove that it is at distance at most $4\delta$ from $f$ in expectation.

We want to prove that

$$\mathbb{E}_{\tilde f}\Big[\Pr_{x,y}\big(\tilde f(x,y) \neq f(x,y)\big)\Big] = \Pr\big(f(\bar x,\bar y) \neq f(x,y)\big) \leq 4\delta,$$

where $x$ and $y$ are chosen uniformly at random. Consider a random vector $\tilde x$ obtained from $x$ by keeping all bits in $T_1$ and randomly flipping all bits not in $T_1$, and a vector $\tilde y$ defined similarly from $y$ and $T_2$. Note that

$$\Pr\big(\tilde f(x,y) \neq f(x,y)\big) = \Pr\big(f(\bar x,\bar y) \neq f(x,y)\big) = \Pr\big(f(\tilde x,\tilde y) \neq f(x,y)\big).$$

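We have

$$\Pr\big(f(x,y) \neq f(\tilde x,y)\big) \leq \Pr\big(f(x,y) \neq f_1(x,y)\big) + \Pr\big(f_1(x,y) \neq f_1(\tilde x,y)\big) + \Pr\big(f_1(\tilde x,y) \neq f(\tilde x,y)\big) \leq \delta + 0 + \delta = 2\delta.$$

The middle term is zero because $\tilde x$ agrees with $x$ on $T_1$, and the last term is at most $\delta$ because $(\tilde x, y)$ is itself uniformly distributed.

Similarly, $\Pr(f(\tilde x,y) \neq f(\tilde x, \tilde y)) \leq 2\delta$. Therefore

$$\Pr\big(f(\bar x,\bar y) \neq f(x,y)\big) \leq 4\delta.$$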
QED

It is easy to "derandomize" this proof. For every $(x,y)$, let $\tilde f(x,y) = 1$ if $f(x',y') = 1$ for most $(x',y')$ in the equivalence class of $(x,y)$, and $\tilde f(x,y) = 0$ otherwise.
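The majority-vote construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `majority_junta` and `noisy_and` are hypothetical, inputs are 0-based bit tuples, `T` plays the role of $T_1 \cup T_2$, and ties are resolved to 0 as in the rule above.

```python
from itertools import product
from collections import defaultdict

def majority_junta(f, n, T):
    """Build f_tilde depending only on the coordinates in T: on each
    equivalence class (inputs agreeing on T) it takes the strict-majority
    value of f, and 0 on ties. Returns (f_tilde, distance), where distance
    is the fraction of inputs on which f and f_tilde differ."""
    T = sorted(T)
    ones = defaultdict(int)   # number of 1-values of f in each class
    size = defaultdict(int)   # number of inputs in each class
    inputs = list(product((0, 1), repeat=n))
    for w in inputs:
        key = tuple(w[i] for i in T)
        ones[key] += f(w)
        size[key] += 1
    table = {key: int(2 * ones[key] > size[key]) for key in size}

    def f_tilde(w):
        return table[tuple(w[i] for i in T)]

    disagreements = sum(f(w) != f_tilde(w) for w in inputs)
    return f_tilde, disagreements / len(inputs)

def noisy_and(w):
    # Hypothetical example: AND of bits 0 and 2, corrupted at one input.
    return (w[0] & w[2]) ^ (w == (0, 1, 0, 1))

f_tilde, dist = majority_junta(noisy_and, 4, {0, 2})
print(dist)  # -> 0.0625
```

Here the single corrupted entry is outvoted inside its class, so the junta differs from the function on exactly one of the 16 inputs.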

12

The smallest $c$ that the bound holds for is $c = \frac{1}{\sqrt 2 - 1} \approx 2.41$.

Lemmas 1 and 2 show that the bound holds for this $c$. Lemma 3 shows that this bound is tight.

(In comparison, Juri's elegant probabilistic argument gives $c=4$.)

Let $c=\frac{1}{\sqrt 2 - 1}$. Lemma 1 gives the upper bound for $k=0$.

Lemma 1: If $f$ is $\epsilon_g$-near a function $g$ that has no influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has no influencing variables in $S_1$, then $f$ is $\epsilon$-near a constant function, where $\epsilon \le c\,\frac{\epsilon_g+\epsilon_h}{2}$.

Proof. Let $\epsilon$ be the distance from $f$ to a constant function. Suppose for contradiction that $\epsilon$ does not satisfy the claimed inequality. Let $y=(x_1,x_2,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$ and write $f$, $g$, and $h$ as $f(y,z)$, $g(y,z)$ and $h(y,z)$, so $g(y,z)$ is independent of $z$ and $h(y,z)$ is independent of $y$.

(I find it helpful to visualize $f$ as the edge-labeling of the complete bipartite graph with vertex sets $\{y\}$ and $\{z\}$, where $g$ gives a vertex-labeling of $\{y\}$, and $h$ gives a vertex-labeling of $\{z\}$.)

Let $g_0$ be the fraction of pairs $(y,z)$ such that $g(y,z) = 0$. Let $g_1=1-g_0$ be the fraction of pairs such that $g(y,z) = 1$. Likewise let $h_0$ be the fraction of pairs such that $h(y,z) = 0$, and let $h_1$ be the fraction of pairs such that $h(y,z) = 1$.

Without loss of generality, assume that, for any pair such that $g(y,z) = h(y,z)$, it also holds that $f(y,z) = g(y,z) = h(y,z)$. (Otherwise, toggling the value of $f(y,z)$ allows us to decrease both $\epsilon_g$ and $\epsilon_h$ by $1/2^n$, while decreasing $\epsilon$ by at most $1/2^n$, so the resulting function is still a counter-example.) Say any such pair is "in agreement".

The distance from $f$ to $g$ plus the distance from $f$ to $h$ is the fraction of $(y,z)$ pairs that are not in agreement. That is, $\epsilon_g + \epsilon_h = g_0 h_1 + g_1 h_0$.
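The identity $\epsilon_g + \epsilon_h = g_0 h_1 + g_1 h_0$ can be checked exactly on small random instances. The sketch below is illustrative only: `check_identity` is a hypothetical helper, the integers `0..N-1` stand in for the assignments to $y$ and to $z$, and $f$ is built to satisfy the without-loss-of-generality assumption (it equals $g$ and $h$ wherever they agree).

```python
from fractions import Fraction
import random

def check_identity(N=8, seed=0):
    """Return (eps_g + eps_h, g0*h1 + g1*h0) for a random instance."""
    rng = random.Random(seed)
    g = [rng.randint(0, 1) for _ in range(N)]  # g depends only on y
    h = [rng.randint(0, 1) for _ in range(N)]  # h depends only on z
    # f copies g and h where they agree; elsewhere its value is arbitrary.
    f = {(y, z): g[y] if g[y] == h[z] else rng.randint(0, 1)
         for y in range(N) for z in range(N)}
    total = N * N
    eps_g = Fraction(sum(f[y, z] != g[y] for y in range(N) for z in range(N)), total)
    eps_h = Fraction(sum(f[y, z] != h[z] for y in range(N) for z in range(N)), total)
    g0 = Fraction(g.count(0), N)
    h0 = Fraction(h.count(0), N)
    g1, h1 = 1 - g0, 1 - h0
    return eps_g + eps_h, g0 * h1 + g1 * h0

lhs, rhs = check_identity()
print(lhs == rhs)  # -> True
```

On a pair where $g$ and $h$ disagree, $f$ matches exactly one of them, so each such pair contributes exactly once to the sum of the two distances.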

The distance from $f$ to the all-zero function is at most $1 - g_0 h_0$.

The distance from $f$ to the all-ones function is at most $1-g_1 h_1$.

Further, the distance from $f$ to the nearest constant function is at most $1/2$.

Thus, the ratio $\epsilon/(\epsilon_g+\epsilon_h)$ is at most

$$\frac{\min(1/2,\; 1-g_0h_0,\; 1-g_1h_1)}{g_0h_1+g_1h_0},$$

where $g_0,h_0 \in [0,1]$ and $g_1 = 1-g_0$ and $h_1=1-h_0$.

By calculation, this ratio is at most $\frac{1}{2(\sqrt 2 - 1)} = c/2$. QED
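The "by calculation" step can be sanity-checked numerically. The following Python sketch (the function name `max_ratio` and the grid resolution are arbitrary choices, not part of the original argument) evaluates the ratio on a grid of $(g_0, h_0)$ values and compares the largest value found with $c/2$. A grid search is evidence, not a proof.

```python
import math

def max_ratio(steps=500):
    """Largest value of min(1/2, 1-g0*h0, 1-g1*h1) / (g0*h1 + g1*h0)
    over a (steps+1) x (steps+1) grid of (g0, h0) in [0,1]^2."""
    best = 0.0
    for i in range(steps + 1):
        g0 = i / steps
        for j in range(steps + 1):
            h0 = j / steps
            g1, h1 = 1 - g0, 1 - h0
            denom = g0 * h1 + g1 * h0
            if denom == 0:
                # g and h are the same constant, so eps_g + eps_h = 0.
                continue
            num = min(0.5, 1 - g0 * h0, 1 - g1 * h1)
            best = max(best, num / denom)
    return best

c = 1 / (math.sqrt(2) - 1)
print(max_ratio(), c / 2)
```

The grid maximum should sit slightly below $c/2 \approx 1.207$, approaching it near $g_0 = h_0 = 1/\sqrt 2$ (and, by symmetry, near $g_0 = h_0 = 1 - 1/\sqrt 2$).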

Lemma 2 extends Lemma 1 to general $k$ by arguing pointwise, over every possible setting of the $2k$ influencing variables. Recall that $c=\frac{1}{\sqrt 2 - 1}$.

Lemma 2: Fix any $k$. If $f$ is $\epsilon_g$-near a function $g$ that has at most $k$ influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has at most $k$ influencing variables in $S_1$, then $f$ is $\epsilon$-near a function $\hat f$ that has at most $2k$ influencing variables, where $\epsilon \le c\,\frac{\epsilon_g+\epsilon_h}{2}$.

Proof. Express $f$ as $f(a,y,b,z)$ where $(a,y)$ contains the variables in $S_1$ with $a$ containing those that influence $h$, while $(b,z)$ contains the variables in $S_2$ with $b$ containing those influencing $g$. So $g(a,y,b,z)$ is independent of $z$, and $h(a,y,b,z)$ is independent of $y$.

For each fixed value of $a$ and $b$, define $F_{ab}(y,z) = f(a,y,b,z)$, and define $G_{ab}$ and $H_{ab}$ similarly from $g$ and $h$ respectively. Let $\epsilon^g_{ab}$ be the distance from $F_{ab}$ to $G_{ab}$ (restricted to $(y,z)$ pairs). Likewise let $\epsilon^h_{ab}$ be the distance from $F_{ab}$ to $H_{ab}$.

By Lemma 1, there exists a constant $c_{ab}$ such that the distance (call it $\epsilon_{ab}$) from $F_{ab}$ to the constant function $c_{ab}$ is at most $c\,(\epsilon^h_{ab} + \epsilon^g_{ab})/2$. Define $\hat f(a,y,b,z) = c_{ab}$.

Clearly $\hat f$ depends only on $a$ and $b$ (and thus on at most $2k$ variables).

Let $\epsilon_{\hat f}$ be the average, over the $(a,b)$ pairs, of the $\epsilon_{ab}$'s, so that the distance from $f$ to $\hat f$ is $\epsilon_{\hat f}$.

Likewise, the distances from $f$ to $g$ and from $f$ to $h$ (that is, $\epsilon_g$ and $\epsilon_h$) are the averages, over the $(a,b)$ pairs, of, respectively, $\epsilon^g_{ab}$ and $\epsilon^h_{ab}$.

Since $\epsilon_{ab} \le c\,(\epsilon^h_{ab} + \epsilon^g_{ab})/2$ for all $a, b$, it follows that $\epsilon_{\hat f} \le c\,(\epsilon_g + \epsilon_h)/2$. QED

Lemma 3 shows that the constant $c$ above is the best you can hope for (even for $k=0$ and $\epsilon=0.5$).

Lemma 3: There exists $f$ such that $f$ is $(0.5/c)$-near two functions $g$ and $h$, where $g$ has no influencing variables in $S_2$ and $h$ has no influencing variables in $S_1$, and $f$ is $0.5$-far from every constant function.

Proof. Let $y$ and $z$ be $x$ restricted to, respectively, $S_1$ and $S_2$. That is, $y=(x_1,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$.

Identify each possible $y$ with a unique element of $[N]$, where $N=2^{n/2}$. Likewise, identify each possible $z$ with a unique element of $[N]$. Thus, we think of $f$ as a function from $[N]\times[N]$ to $\{0,1\}$.

Define $f(y,z)$ to be 1 iff $\max(y,z) \ge \frac{1}{\sqrt 2}N$.

By calculation, the fraction of $f$'s values that are zero is $(\frac{1}{\sqrt 2})^2 = \frac{1}{2}$, so both constant functions have distance $\frac{1}{2}$ to $f$.

Define $g(y,z)$ to be 1 iff $y\ge \frac{1}{\sqrt 2}N$. Then $g$ has no influencing variables in $S_2$. The distance from $f$ to $g$ is the fraction of pairs $(y,z)$ such that $y<\frac{1}{\sqrt 2}N$ and $z\ge \frac{1}{\sqrt 2}N$. By calculation, this is at most $\frac{1}{\sqrt 2}(1-\frac{1}{\sqrt2}) = 0.5/c$.

Similarly, the distance from $f$ to $h$, where $h(y,z)=1$ iff $z\ge \frac{1}{\sqrt 2}N$, is at most $0.5/c$.

QED
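The construction in Lemma 3 can be checked on a concrete instance. The sketch below is illustrative only: `lemma3_distances` is a hypothetical name, $[N]$ is taken to be `0..N-1`, and $N = 256$ is an arbitrary choice. Because $N/\sqrt 2$ is not an integer, the fractions match $1/2$ and $0.5/c$ only up to a rounding error that shrinks as $N$ grows.

```python
import math

def lemma3_distances(N=256):
    """Return (fraction of zeros of f, dist(f, g), dist(f, h)) for
    f = [max(y,z) >= N/sqrt(2)], g = [y >= N/sqrt(2)], h = [z >= N/sqrt(2)]."""
    thr = N / math.sqrt(2)

    def f(y, z):
        return int(max(y, z) >= thr)

    def g(y, z):
        return int(y >= thr)   # ignores z

    def h(y, z):
        return int(z >= thr)   # ignores y

    pairs = [(y, z) for y in range(N) for z in range(N)]
    total = len(pairs)
    zeros = sum(f(y, z) == 0 for y, z in pairs) / total
    dist_g = sum(f(y, z) != g(y, z) for y, z in pairs) / total
    dist_h = sum(f(y, z) != h(y, z) for y, z in pairs) / total
    return zeros, dist_g, dist_h

c = 1 / (math.sqrt(2) - 1)
zeros, dist_g, dist_h = lemma3_distances()
print(zeros, dist_g, dist_h, 0.5 / c)
```

The first value should be close to $0.5$, and the two distances should be equal and close to $0.5/c \approx 0.207$.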

First of all, thanks Neal! This indeed sums it up for $k=0$, and sheds some light on the general problem. However in the case of $k=0$ the problem is a bit degenerate (as $2k=k$), so I'm more curious regarding the case of $k \ge 1$. I didn't manage to extend this claim for $k>0$, so if you have an idea on how to do it - I'd appreciate it. If it simplifies the problem, then the exact constants are not crucial; that is, $\epsilon/2$-far can be replaced by $\epsilon/c$-far, for some universal constant $c$.

2
I've edited it to add the extension to general k. And Yuri's argument below gives a slightly looser factor with an elegant probabilistic argument.
Neal Young

Sincere thanks Neal! This line of reasoning is quite enlightening.
Licensed under cc by-sa 3.0 with attribution required.