Consider the simple linear model:

$\pmb{y}=X'\pmb{\beta}+\epsilon$

where $\epsilon_i\sim\mathrm{i.i.d.}\;\mathcal{N}(0,\sigma^2)$, $X\in\mathbb{R}^{n\times p}$ with $p\geq2$, and $X$ contains a column of constants.

My question is: given $\mathrm{E}(X'X)$, $\beta$ and $\sigma$, is there a formula for a nontrivial upper bound on $\mathrm{E}(R^2)$*? (Assuming the model was estimated by OLS.)

* In writing this, I assumed that obtaining $E(R^2)$ itself would not be possible.

EDIT1

Using the solution derived by Stéphane Laurent (see below), we can obtain a nontrivial upper bound on $E(R^2)$. Some numerical simulations (below) show that this bound is actually quite tight.

Stéphane Laurent derived the following: $R^2\sim\mathrm{B}(p-1,n-p,\lambda)$, where $\mathrm{B}(p-1,n-p,\lambda)$ is a noncentral beta distribution with noncentrality parameter $\lambda$ given by

$\lambda=\frac{||X'\beta-\mathrm{E}(X)'\beta1_n||^2}{\sigma^2}$

So

$\mathrm{E}(R^2)=\mathrm{E}\left(\frac{\chi^2_{p-1}(\lambda)}{\chi^2_{p-1}(\lambda)+\chi^2_{n-p}}\right)\leq\frac{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)}{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)+\mathrm{E}\left(\chi^2_{n-p}\right)}$

where $\chi^2_{k}(\lambda)$ is a noncentral $\chi^2$ with noncentrality parameter $\lambda$ and $k$ degrees of freedom. So a nontrivial upper bound for $\mathrm{E}(R^2)$ is

$\frac{\lambda+p-1}{\lambda+n-1}$

It is very tight (much tighter than I expected would be possible):

For example, using:

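rho <- 0.75
p <- 10
n <- 25 * p
Su <- matrix(rho, p - 1, p - 1)  # (p-1) x (p-1) matrix with rho off the diagonal
diag(Su) <- 1                    # unit diagonal
su <- 1
set.seed(123)
bet <- runif(p)                  # p coefficients drawn uniformly on (0, 1)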

The mean of $R^2$ over 1000 simulations is 0.960819. The theoretical upper bound above gives 0.9609081. The bound seems to be equally precise over many values of $R^2$. Truly astonishing!
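The setup above does not show the simulation loop itself. Here is a sketch of one way it could be completed, not the original script. It assumes that the $p-1$ non-constant regressors are drawn once from $\mathcal{N}(0, \mathtt{Su})$ and held fixed, that `bet[1]` is the intercept, and that `su` is the error standard deviation. The exact figures it produces need not match those quoted above.

```r
# Sketch only: the design choices below are assumptions, not the original script.
set.seed(123)
rho <- 0.75; p <- 10; n <- 25 * p; su <- 1
Su <- matrix(rho, p - 1, p - 1); diag(Su) <- 1
bet <- runif(p)                        # assumed: bet[1] is the intercept

# Draw the p-1 regressors once from N(0, Su) using the Cholesky factor of Su.
Z <- matrix(rnorm(n * (p - 1)), n, p - 1) %*% chol(Su)
X <- cbind(1, Z)
mu <- drop(X %*% bet)

# Noncentrality parameter: squared norm of the centred mean vector over sigma^2.
lambda <- sum((mu - mean(mu))^2) / su^2
bound <- (lambda + p - 1) / (lambda + n - 1)

# Monte Carlo estimate of E(R^2) with X held fixed across replications.
r2 <- replicate(1000, {
  y <- mu + rnorm(n, sd = su)
  summary(lm(y ~ Z))$r.squared
})
c(simulated = mean(r2), bound = bound)
```

With $X$ held fixed, $\lambda$ is a constant and the comparison is between the Monte Carlo mean and the bound evaluated at that $\lambda$. Redrawing $X$ in every replication would instead average over a random $\lambda$.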

EDIT2:

After further research, it appears that the quality of the upper bound as an approximation to $E(R^2)$ improves as $\lambda+p$ increases (and, all else equal, $\lambda$ increases with $n$).
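The direction of the inequality behind the bound can be checked with the Poisson-mixture form of the noncentral $\chi^2$. This is a sketch added for clarity; the auxiliary variable $J$ is not part of the original derivation. Conditionally on $J\sim\mathrm{Poisson}(\lambda/2)$, $\chi^2_{p-1}(\lambda)$ is a central $\chi^2_{p-1+2J}$, so $R^2$ given $J$ is a central beta with mean

$\mathrm{E}(R^2\mid J)=\frac{p-1+2J}{n-1+2J}=1-\frac{n-p}{n-1+2J}$

This is a concave function of $J$, so Jensen's inequality with $\mathrm{E}(J)=\lambda/2$ gives

$\mathrm{E}(R^2)\leq\frac{p-1+\lambda}{n-1+\lambda}$

with equality when $\lambda=0$.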

linear-model expected-value

— user603

$R^2$ has a Beta distribution with parameters that depend only on $n$ and $p$. No?

— Stéphane Laurent

Oops, sorry, my previous claim is true only under the "null model" hypothesis (intercept only). Otherwise the distribution of $R^2$ should be something like a noncentral Beta distribution, with a noncentrality parameter involving the unknown parameters.

— Stéphane Laurent

@StéphaneLaurent: thanks. Would you know more about the relationship between the unknown parameters and the parameters of the Beta? I'm stuck, so any pointer would be welcome...

— user603

Do you absolutely need to deal with $E[R^2]$? Perhaps there is a simple exact formula for $E[R^2/(1-R^2)]$.

— Stéphane Laurent

With the notations of my answer, $R^2/(1-R^2) = k F$ for some scalar $k$, and the first moment of the noncentral $F$-distribution is simple.

— Stéphane Laurent
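To make this suggestion concrete, here is a short worked version in the question's notation. It is a sketch that relies on the standard first moment of the noncentral $F$ distribution and on the independence of numerator and denominator. With $p-1$ and $n-p$ degrees of freedom,

$\frac{R^2}{1-R^2}=\frac{\chi^2_{p-1}(\lambda)}{\chi^2_{n-p}}=\frac{p-1}{n-p}\,F,\qquad F\sim F_{p-1,\,n-p}(\lambda)$

and, for $n-p>2$,

$\mathrm{E}\left(\frac{R^2}{1-R^2}\right)=\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)\mathrm{E}\left(\frac{1}{\chi^2_{n-p}}\right)=\frac{p-1+\lambda}{n-p-2}$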

Any linear model can be written $\boxed{Y=\mu+\sigma G}$ where $G$ has the standard normal distribution on $\mathbb{R}^n$ and $\mu$ is assumed to belong to a linear subspace $W$ of $\mathbb{R}^n$ . In your case $W=\text{Im}(X)$ .

Let $[1] \subset W$ be the one-dimensional linear subspace generated by the vector $(1,1,\ldots,1)$ . Taking $U=[1]$ below, the $R^2$ is highly related to the classical Fisher statistic

$F = \frac{{\Vert P_Z Y\Vert}^2/(m-\ell)}{{\Vert P_W^\perp Y\Vert}^2/(n-m)},$

for the hypothesis test of $H_0\colon\{\mu \in U\}$ where $U\subset W$ is a linear subspace, and denoting by $Z=U^\perp \cap W$ the orthogonal complement of $U$ in $W$, and denoting $m=\dim(W)$ and $\ell=\dim(U)$ (then $m=p$ and $\ell=1$ in your situation).

Indeed,

$\dfrac{{\Vert P_Z Y\Vert}^2}{{\Vert P_W^\perp Y\Vert}^2} = \frac{R^2}{1-R^2}$

because the definition of $R^2$ is

$R^2 = \frac{{\Vert P_Z Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}=1 - \frac{{\Vert P^\perp_W Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}.$
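The projection form of $R^2$ can be checked numerically. The sketch below uses made-up data (the sizes, coefficients and the helper `proj` are illustrative assumptions) and builds $P_Z$ as $P_W - P_U$, which is valid because $U\subset W$.

```r
set.seed(1)
n <- 50; p <- 4
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # W = Im(X), with a constant column
y <- drop(X %*% c(1, 0.5, -0.3, 0.2)) + rnorm(n)

# Orthogonal projector onto the column space of A (hypothetical helper).
proj <- function(A) A %*% solve(crossprod(A), t(A))
PW <- proj(X)
PU <- proj(matrix(1, n, 1))   # U = [1]
PZ <- PW - PU                 # projector onto Z, the orthogonal complement of U in W

# R^2 as a ratio of squared norms, and the ratio that equals R^2 / (1 - R^2).
r2_proj <- sum((PZ %*% y)^2) / sum((y - PU %*% y)^2)
ratio   <- sum((PZ %*% y)^2) / sum((y - PW %*% y)^2)

r2_lm <- summary(lm(y ~ X[, -1]))$r.squared
all.equal(r2_proj, r2_lm)
all.equal(ratio, r2_lm / (1 - r2_lm))
```

Both comparisons should agree up to floating-point error, since `lm` with an intercept reports the same centred $R^2$.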

Obviously $\boxed{P_Z Y = P_Z \mu + \sigma P_Z G}$ and $\boxed{P_W^\perp Y = \sigma P_W^\perp G}$ .

When $H_0\colon\{\mu \in U\}$ is true then $P_Z \mu = 0$ and therefore

$F = \frac{{\Vert P_Z G\Vert}^2/(m-\ell)}{{\Vert P_W^\perp G\Vert}^2/(n-m)} \sim F_{m-\ell,n-m}$

has the Fisher $F_{m-\ell,n-m}$ distribution. Consequently, from the classical relation between the Fisher distribution and the Beta distribution, $R^2 \sim {\cal B}(m-\ell, n-m)$.

In the general situation we have to deal with $P_Z Y = P_Z \mu + \sigma P_Z G$ when $P_Z\mu \neq 0$ . In this general case one has ${\Vert P_Z Y\Vert}^2 \sim \sigma^2\chi^2_{m-\ell}(\lambda)$ , the noncentral $\chi^2$ distribution with $m-\ell$ degrees of freedom and noncentrality parameter $\boxed{\lambda=\frac{{\Vert P_Z \mu\Vert}^2}{\sigma^2}}$ , and then $\boxed{F \sim F_{m-\ell,n-m}(\lambda)}$ (noncentral Fisher distribution). This is the classical result used to compute power of $F$ -tests.

The classical relation between the Fisher distribution and the Beta distribution holds in the noncentral situation too. Finally, $R^2$ has the noncentral beta distribution with "shape parameters" $m-\ell$ and $n-m$ and noncentrality parameter $\lambda$. I think the moments are available in the literature, but they are possibly highly complicated.
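Even without a closed form, the mean of the noncentral beta can be evaluated numerically. The sketch below uses illustrative values of $n$, $p$ and $\lambda$ (assumptions, not taken from the thread). It relies on the `ncp` argument of R's `dbeta`, which R documents as the distribution of $X/(X+Y)$ with $X\sim\chi^2_{2a}(\lambda)$ and $Y\sim\chi^2_{2b}$, so the shapes are half the degrees of freedom.

```r
p <- 3; n <- 30; lambda <- 20          # illustrative values (assumptions)
a <- (p - 1) / 2; b <- (n - p) / 2     # standard beta shapes: half the degrees of freedom

# E(R^2) by numerical integration of x times the noncentral beta density.
exact <- integrate(function(x) x * dbeta(x, a, b, ncp = lambda), 0, 1)$value

# Monte Carlo cross-check from the chi-square representation.
set.seed(7)
A <- rchisq(1e5, df = p - 1, ncp = lambda)
B <- rchisq(1e5, df = n - p)
mc <- mean(A / (A + B))

# Ratio-of-expectations value (lambda + p - 1) / (lambda + n - 1).
bound <- (lambda + p - 1) / (lambda + n - 1)
c(exact = exact, monte_carlo = mc, bound = bound)
```

For very large $\lambda$ the density becomes sharply peaked, so the integration range may need narrowing around the peak for `integrate` to stay reliable.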

Finally let us write down $P_Z\mu$ . Note that $P_Z = P_W - P_U$ . One has $P_U \mu = \bar\mu 1$ when $U=[1]$ , and $P_W \mu = \mu$ . Hence $P_Z \mu =\mu - \bar\mu 1$ where here $\mu=X\beta$ for the unknown parameter vector $\beta$ .
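Written out coordinate-wise, this gives the following restatement. It assumes the first column of $X$ is the constant; $x_{ij}$ and $\bar x_j$ denote the entries and column means of $X$, notation introduced here for illustration.

$\lambda=\frac{{\Vert \mu-\bar\mu 1\Vert}^2}{\sigma^2}=\frac{1}{\sigma^2}\sum_{i=1}^n(\mu_i-\bar\mu)^2,\qquad \mu_i-\bar\mu=\sum_{j=2}^{p}\beta_j\,(x_{ij}-\bar x_j)$

So the intercept drops out of $\lambda$, which depends only on the slopes, the centred regressors and $\sigma$.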

— Stéphane Laurent

$P_Z x$ is the orthogonal projection of $x$ on the linear subspace $Z$. And $P^\perp$ denotes projection on the orthogonal complement.

— Stéphane Laurent

Beware of $Px \neq \Vert P x \Vert^2$. I'm going to edit my post to write the formulas.

— Stéphane Laurent

Done - do you see any simplification ?

— Stéphane Laurent

$\bar \mu = \frac{1}{n} \sum \mu_i$

— Stéphane Laurent

Type I, obviously: type II are distributed on $(0, \infty)$. Actually $R^2/(1-R^2)$ has the type II distribution. I have done the last corrections for today.

— Stéphane Laurent

Conditional expectation of R-squared
