How can I test whether two parameter estimates in the same model are significantly different?

I have the model

$y=x^a \times z^b + e$

where $y$ is the dependent variable, $x$ and $z$ are explanatory variables, $a$ and $b$ are the parameters, and $e$ is an error term. I have estimates of $a$ and $b$ and a covariance matrix of these estimates. How do I test whether $a$ and $b$ are significantly different?

statistical-significance nonlinear-regression

— K. Roelofs

Assessing the hypothesis that $a$ and $b$ are different is equivalent to testing the null hypothesis $a - b = 0$ (against the alternative that $a-b\ne 0$).

The following analysis presumes it is reasonable for you to estimate $a-b$ as

$U = \hat a - \hat b.$

It also accepts your model formulation (which is often a reasonable one). Because the errors are additive (and could even produce negative observed values of $y$), that formulation does not permit us to linearize the model by taking logarithms of both sides.

The variance of $U$ can be expressed in terms of the covariance matrix $(c_{ij})$ of $(\hat a, \hat b)$ as

$\operatorname{Var}(U) = \operatorname{Var}(\hat a - \hat b) = \operatorname{Var}(\hat a) + \operatorname{Var}(\hat b) - 2 \operatorname{Cov}(\hat a, \hat b) = c_{11} + c_{22} - 2c_{12}.$
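As a small illustration of the variance calculation, here is a sketch in R. The estimates and the covariance matrix are made-up numbers chosen only for the example (they do not come from the question), and the names `est`, `V`, and `w` are hypothetical.

```r
# Hypothetical estimates and covariance matrix (illustrative values only).
est <- c(a = 0.80, b = 0.50)
V <- matrix(c(0.040, 0.010,
              0.010, 0.090), nrow = 2,
            dimnames = list(names(est), names(est)))

# Estimated difference U = a.hat - b.hat.
U <- est[["a"]] - est[["b"]]

# Var(U) = Var(a.hat) + Var(b.hat) - 2 Cov(a.hat, b.hat).
var_U <- V[1, 1] + V[2, 2] - 2 * V[1, 2]

# The same quantity written as a quadratic form with the weights (1, -1).
w <- c(1, -1)
var_U_qf <- drop(t(w) %*% V %*% w)
```

The quadratic form is the same computation the simulation code further down performs with `H.0 %*% cc %*% H.0`.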

When $(\hat a, \hat b)$ is estimated with least squares, one usually uses a "t test;" that is, the distribution of

$t = U / \sqrt{\operatorname{Var}(U)}$

is approximated by a Student t distribution with $n-2$ degrees of freedom (where $n$ is the data count and $2$ counts the number of coefficients). Regardless, $t$ is usually the basis of any test. You may perform a Z test (when $n$ is large or when fitting with Maximum Likelihood) or bootstrap it, for instance.

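To be specific, the p-value of the t test is given by

$p = 2t_{n-2}(-|t|)$

where $t_{n-2}$ is the cumulative distribution function of the Student t distribution with $n-2$ degrees of freedom. This is the two-sided tail area: the chance that a Student t variable with $n-2$ degrees of freedom is at least as large in absolute value as the observed $|t|.$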
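A minimal sketch of the p-value calculation in R follows. The inputs are made-up numbers for illustration only; `pt` and `pnorm` are the base R distribution functions for the Student t and standard Normal distributions.

```r
# Hypothetical inputs (illustrative values only).
U <- 0.30        # a.hat - b.hat
var_U <- 0.11    # estimated variance of U
n <- 20          # number of observations

# Test statistic.
t_stat <- U / sqrt(var_U)

# Two-sided p-value from the Student t distribution with n - 2 df.
p_t <- 2 * pt(-abs(t_stat), df = n - 2)

# Large-sample alternative: the same statistic referred to a Normal (Z test).
p_z <- 2 * pnorm(-abs(t_stat))
```

Because the Student t distribution has heavier tails than the Normal, `p_t` is never smaller than `p_z`; the two agree closely when $n$ is large.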

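More generally, for any numbers $c_1,$ $c_2,$ and $\mu$ you can use the same approach to test the hypothesis

$H_0: c_1 a + c_2 b = \mu$

against the two-sided alternative. Use the estimated covariance matrix $(c_{ij})$ to estimate the variance of $U = c_1 \hat a + c_2 \hat b,$ which is $\operatorname{Var}(U) = c_1^2 c_{11} + c_2^2 c_{22} + 2 c_1 c_2 c_{12},$ and form the statistic

$t = (c_1 \hat a + c_2 \hat b - \mu) / \sqrt{\operatorname{Var}(U)}.$

The foregoing is the case $(c_1,c_2) = (1,-1)$ and $\mu=0.$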
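The general test can be wrapped in a small helper. This is only a sketch: `linear_test` is a hypothetical name, not a function from R or any package, the weights are called `w` to avoid clashing with R's `c()`, and the numbers are made up for illustration.

```r
# Hypothetical helper: test H0: sum(w * est) = mu using a Student t reference.
#   est : vector of parameter estimates
#   V   : estimated covariance matrix of est
#   w   : weights (c_1, c_2) defining the linear combination
#   mu  : hypothesized value of the combination
#   df  : degrees of freedom for the t distribution
linear_test <- function(est, V, w, mu = 0, df) {
  u <- sum(w * est)                       # estimated linear combination
  se <- sqrt(drop(t(w) %*% V %*% w))      # its standard error
  t_stat <- (u - mu) / se
  c(estimate = u, se = se, t = t_stat,
    p = 2 * pt(-abs(t_stat), df = df))    # two-sided p-value
}

# Illustrative values only.
est <- c(a = 0.80, b = 0.50)
V <- matrix(c(0.040, 0.010,
              0.010, 0.090), nrow = 2)

# H0: a - b = 0, with n = 20 observations and 2 coefficients.
res <- linear_test(est, V, w = c(1, -1), mu = 0, df = 20 - 2)
```

With a fitted `nls` object `fit`, the inputs could be taken from `coef(fit)`, `vcov(fit)`, and `df.residual(fit)`.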

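As a check on the Student t approximation for $t$, the R code below simulates data from this model with Normally distributed errors, fits the model, and computes $t$ each time. It runs a simulation of size $500$ with $n=5$ observations and $a=b=-1/2,$ then plots the simulated values of $t$ against Student t quantiles with $n-2$ degrees of freedom. Points lying close to the reference line indicate that the approximation is adequate.

Consider rerunning the simulation with values of $a,$ $b,$ $\sigma,$ and $n$ that reflect your situation.

Here is the code.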

#
# Specify the true parameters.
#
set.seed(17)
a <- -1/2
b <- -1/2
sigma <- 0.25 # Standard deviation of the errors (passed to rnorm as sd)
n <- 5        # Sample size
n.sim <- 500  # Simulation size
#
# Specify the hypothesis.
#
H.0 <- c(1, -1) # Coefficients of `a` and `b`.
mu <- 0 
#
# Provide x and z values in terms of their logarithms.
#
log.x <- log(rexp(n))
log.z <- log(rexp(n))
#
# Compute y without error.
#
y.0 <- exp(a * log.x + b * log.z)
#
# Conduct a simulation to estimate the sampling distribution of the t statistic.
#
sim <- replicate(n.sim, {
  #
  # Add the errors.
  #
  e <- rnorm(n, 0, sigma)
  df <- data.frame(log.x=log.x, log.z=log.z, y.0, y=y.0 + e)
  #
  # Guess the solution.
  #
  fit.ols <- lm(log(y) ~ log.x + log.z - 1, subset(df, y > 0))
  start <- coefficients(fit.ols) # Initial values of (a.hat, b.hat)
  #
  # Polish it using nonlinear least squares.
  #
  fit <- nls(y ~ exp(a * log.x + b * log.z), df, list(a=start[1], b=start[2]))
  #
  # Test a hypothesis.
  #
  cc <- vcov(fit)
  s <- sqrt((H.0 %*% cc %*% H.0))
  (crossprod(H.0, coef(fit)) - mu) / s
})
#
# Display the simulation results.
#
summary(lm(sort(sim) ~ 0 + ppoints(length(sim))))
qqplot(qt(ppoints(length(sim)), df=n-2), sim, 
       pch=21, bg="#00000010", col="#00000040",
       xlab="Student t reference value", 
       ylab="Test statistic")
abline(0:1, col="Red", lwd=2)

— whuber

This is excellent. The answer comes with theory, with steps to follow to repeat it for other tests, with a numerical approach for clarity, and with code. This is the gold standard.

— SecretAgentMan

I find "the hypothesis that $a$ and $b$ are different" ambiguous in your opening sentence, because it is not clear whether that is a null or an alternative hypothesis. The OP's question makes clear that they are looking for evidence of a difference, and the second clause of your sentence speaks to that. Pedagogically, I think it helps people newer to hypothesis testing to be super explicit. (But +1 for your answer overall :)

— Alexis

@Alexis thank you - I see what you are saying. Because I have such people in mind, I will clarify.

— whuber