How to interpret parameter estimates in Poisson GLM results [closed]

Call:
glm(formula = darters ~ river + pH + temp, family = poisson, data = darterData)

Deviance Residuals:
    Min      1Q   Median     3Q    Max
-3.7422 -1.0257   0.0027 0.7169 3.5347

Coefficients:
              Estimate Std.Error z value Pr(>|z|)
(Intercept)   3.144257  0.218646  14.381  < 2e-16 ***
riverWatauga -0.049016  0.051548  -0.951  0.34166
pH            0.086460  0.029821   2.899  0.00374 **
temp         -0.059667  0.009149  -6.522  6.95e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)
Null deviance: 233.68 on 99 degrees of freedom
Residual deviance: 187.74 on 96 degrees of freedom
AIC: 648.21

I want to know how to interpret each parameter estimate in the table above.

— tomjerry001

The interpretation is identical: stats.stackexchange.com/a/126225/7071

— Dimitriy V. Masterov

This question appears to be off-topic because it is about explaining R output without any form of intelligent question behind it. It falls into the category of "I dumped my computer output here, now run the stat analysis for me"...

— Xi'an

The dispersion parameter seems to indicate that there are some problems with the model. Perhaps you should consider using a quasipoisson distribution. I bet your parameter estimates will change drastically, and so will the interpretation. If you run `plot(model)` you will get some plots of your residuals; look at these plots for unwanted patterns before you start interpreting your actual model. To quickly plot the fit of your model you can also use `visreg(modelfit)` from the visreg package.

— Robbie
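As a rough check on the dispersion concern in the comment above, here is a minimal sketch that uses only the numbers printed in the question's summary. The line "Dispersion parameter for poisson family taken to be 1" is printed for every Poisson fit, so it carries no diagnostic information by itself. A common heuristic, which is an assumption here and not something the commenter computed, is to compare the residual deviance with its degrees of freedom: a ratio well above 1 hints at overdispersion.

```python
# Values copied from the summary output in the question.
residual_deviance = 187.74
residual_df = 96

# Heuristic only: a ratio near 1 is what a well-specified Poisson model
# would roughly give; a ratio well above 1 hints at overdispersion.
dispersion_ratio = residual_deviance / residual_df
print(f"residual deviance / df = {dispersion_ratio:.3f}")
```

The Pearson chi-square statistic divided by the residual degrees of freedom is the more usual estimate, but it cannot be computed from the printed summary alone.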

@Xi'an, although the question is poor and needs editing, I don't think it is off-topic. Consider these questions, which are not considered off-topic: Interpretation of R's lm() output and Interpretation of R's output for binomial regression. It does appear to be a duplicate, however.

— gung - Reinstate Monica

This is a duplicate of How to interpret coefficients in a Poisson regression? Please read the linked thread. If you still have a question after reading it, come back here and edit your question to state what you have learned and what you still need to know. Then we can provide the information you need without simply duplicating material elsewhere that already didn't help you.

— gung - Reinstate Monica

I don't think the title of your question accurately captures what you're asking for.

The question of how to interpret the parameters in a GLM is very broad because the GLM is a very broad class of models. Recall that a GLM models a response variable $y$ that is assumed to follow a known distribution from the exponential family, and that we have chosen an invertible function $g$ such that

$$\mathrm{E}\left[y\,|\,x\right] = g^{-1}{\left(x_0 + x_1\beta_1 + \dots + x_J\beta_J\right)}$$

for $J$ predictor variables $x$, with $x_0$ playing the role of the intercept. In this model, the interpretation of any particular parameter $\beta_j$ is the rate of change of $g{\left(\mathrm{E}\left[y\,|\,x\right]\right)}$ with respect to $x_j$. Define $\mu \equiv \mathrm{E}{\left[y\,|\,x\right]} = g^{-1}{\left(\eta\right)}$ and $\eta \equiv x \cdot \beta$ to keep the notation clean. Then, for any $j \in \{1,\dots,J\}$,

$$\beta_j = \frac{\partial\,\eta}{\partial\,x_j} = \frac{\partial\,g(\mu)}{\partial\,x_j} \text{.}$$

Now define $\mathfrak{e}_j$ to be a vector of $J-1$ zeros and a single $1$ in the $j$th position, so that for example if $J=5$ then $\mathfrak{e}_3 = \left(0,0,1,0,0\right)$. Then

$$\beta_j = g{\left(\mathrm{E}{\left[y\,|\,x + \mathfrak{e}_j \right]}\right)} - g{\left(\mathrm{E}{\left[y\,|\,x\right]}\right)}$$

which just means that $\beta_j$ is the effect on $\eta$ of a unit increase in $x_j$.

You can also state the relationship this way:

$$\frac{\partial\,\mathrm{E}{\left[y\,|\,x\right]}}{\partial\,x_j} = \frac{\partial\,\mu}{\partial\,x_j} = \frac{\mathrm{d}\mu}{\mathrm{d}\eta}\frac{\partial\,\eta}{\partial\,x_j} = \frac{\mathrm{d}\mu}{\mathrm{d}\eta}\,\beta_j = \frac{\mathrm{d}g^{-1}}{\mathrm{d}\eta}\,\beta_j$$

and

$$\mathrm{E}{\left[y\,|\,x + \mathfrak{e}_j \right]} - \mathrm{E}{\left[y\,|\,x\right]} \equiv \Delta_j \hat y = g^{-1}{\left( \left(x + \mathfrak{e}_j\right)\beta \right)} - g^{-1}{\left( x\,\beta \right)}$$

Without knowing anything about $g$, that's as far as we can get. $\beta_j$ is the effect on $\eta$, the transformed conditional mean of $y$, of a unit increase in $x_j$, and the effect on the conditional mean of $y$ of a unit increase in $x_j$ is $g^{-1}{\left( \left(x + \mathfrak{e}_j\right)\beta \right)} - g^{-1}{\left( x\,\beta \right)}$, which in general depends on $x$.
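The general statement can be checked numerically. The sketch below is only an illustration under stated assumptions: the coefficients and predictor values are made up, and the intercept is written as `beta[0]` multiplied by a constant `x[0] = 1` instead of the $x_0$ used in the formulas. It shows that a unit increase in $x_j$ shifts $g(\mu)$ by exactly $\beta_j$ for two different links, while the shift in $\mu$ itself differs between them.

```python
import math

# Hypothetical values for illustration only; none of these come from the post.
beta = [0.5, -0.2, 0.1]   # beta[0] acts as the intercept
x = [1.0, 2.0, 3.0]       # x[0] = 1 multiplies the intercept

def linear_predictor(x, beta):
    """eta = x . beta"""
    return sum(xi * bi for xi, bi in zip(x, beta))

# Each entry is (g, g_inverse).
links = {
    "log": (math.log, math.exp),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}

j = 2                      # predictor whose unit increase we examine
x_plus = list(x)
x_plus[j] += 1.0           # x + e_j

results = {}
for name, (g, g_inv) in links.items():
    mu0 = g_inv(linear_predictor(x, beta))
    mu1 = g_inv(linear_predictor(x_plus, beta))
    results[name] = {
        "link_scale_change": g(mu1) - g(mu0),  # equals beta[j] for any link
        "mean_scale_change": mu1 - mu0,        # depends on the link and on x
    }
    print(name, results[name])
```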

But you seem to be asking specifically about Poisson regression using R's default link function, which in this case is the natural logarithm. If so, you are asking about a specific kind of GLM, in which $y \sim \mathrm{Poisson}{\left(\lambda\right)}$ and $g = \ln$. Then we can get some traction toward a specific interpretation.

From what I said above, we know that $\frac{\partial\,\mu}{\partial\,x_j} = \frac{\mathrm{d}g^{-1}}{\mathrm{d}\eta}\,\beta_j$. And since we know $g(\mu) = \ln(\mu)$, we also know that $g^{-1}(\eta) = e^\eta$. We also happen to know that $\frac{\mathrm{d}e^\eta}{\mathrm{d}\eta} = e^\eta$, so we can say that

$$\frac{\partial\,\mu}{\partial\,x_j} = \frac{\partial\,\mathrm{E}{\left[y\,|\,x\right]}}{\partial\,x_j} = e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J}\,\beta_j$$

which finally means something tangible:

Given a very small change $\delta$ in $x_j$, the fitted $\hat y$ changes by approximately $\hat y\,\beta_j\,\delta$.

Note: this approximation can actually work for changes as large as 0.2, depending on how much precision you need.
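To get a feel for how good the small-change approximation is, the following sketch compares it with the exact change under a log link. It uses the pH coefficient from the summary table; the fitted count of 100 and the step sizes are assumptions chosen purely for illustration.

```python
import math

beta_ph = 0.086460   # pH coefficient from the summary table
y_hat = 100.0        # hypothetical fitted count, for illustration

def exact_change(y_hat, beta, delta):
    """Exact change in the fitted mean under a log link."""
    return y_hat * (math.exp(beta * delta) - 1.0)

def linear_change(y_hat, beta, delta):
    """First-order (derivative-based) approximation."""
    return y_hat * beta * delta

for delta in (0.01, 0.2, 1.0):
    exact = exact_change(y_hat, beta_ph, delta)
    approx = linear_change(y_hat, beta_ph, delta)
    rel_err = (exact - approx) / exact
    print(f"delta={delta}: exact={exact:.4f}, approx={approx:.4f}, "
          f"relative error={rel_err:.2%}")
```

The relative error depends on the product $\beta_j\,\delta$, so a larger coefficient shrinks the range of step sizes over which the linear approximation stays accurate.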

And using the more familiar unit-change interpretation, we have:

$$\begin{aligned}
\Delta_j \hat y &= e^{x_0 + x_1\beta_1 + \dots + \left(x_j + 1\right)\beta_j + \dots + x_J\beta_J} - e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J} \\
&= e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J + \beta_j} - e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J} \\
&= e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J}\,e^{\beta_j} - e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J} \\
&= e^{x_0 + x_1\beta_1 + \dots + x_J\beta_J} \left( e^{\beta_j} - 1 \right)
\end{aligned}$$

which means

Given a unit change in $x_j$, the fitted $\hat y$ changes by $\hat y \left( e^{\beta_j} - 1 \right)$.

There are three important pieces to note here:

The effect of a change in the predictors depends on the level of the response.
An additive change in the predictors has a multiplicative effect on the response.
You can't interpret the coefficients just by reading them (unless you can compute arbitrary exponentials in your head).

So in your example, the effect of increasing pH by 1 is to increase $\ln \hat y$ by $0.09$, and therefore to increase $\hat y$ by $\hat y \left( e^{0.09} - 1 \right)$; that is, to multiply $\hat y$ by $e^{0.09} \approx 1.09$. It looks like your outcome is the number of darters you observe in some fixed unit of time (say, a week). So if you're observing 100 darters a week at a pH of 6.7, raising the pH of the river to 7.7 means you can now expect to see about 109 darters a week.
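The same arithmetic can be applied to every coefficient in the table. This is a minimal sketch that exponentiates the estimates copied from the question's summary output; the starting count of 100 darters is the hypothetical figure used in the example above.

```python
import math

# Estimates copied from the summary table in the question.
coefficients = {
    "riverWatauga": -0.049016,
    "pH": 0.086460,
    "temp": -0.059667,
}

# exp(beta_j) is the factor by which the fitted count is multiplied
# when x_j increases by one unit, holding the other predictors fixed.
multipliers = {name: math.exp(b) for name, b in coefficients.items()}

for name, m in multipliers.items():
    print(f"{name}: multiply fitted count by {m:.4f} "
          f"({(m - 1.0):+.1%} per unit increase)")

# Worked example from the answer: 100 darters, pH raised by 1.
baseline_count = 100.0
print(f"expected count after +1 pH: {baseline_count * multipliers['pH']:.1f}")
```

For the `riverWatauga` dummy, a "unit increase" means switching from the reference river to the Watauga, so its multiplier compares the two rivers at equal pH and temperature.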

— shadowtalker

I made a couple tweaks here, @ssdecontrol. I think they'll make your post a little easier to follow, but if you don't like them, roll them back with my apologies.

— gung - Reinstate Monica

If you can't figure that out from my answer, then clearly I need to revise the answer. What are you still confused about?

— shadowtalker

Plug those numbers into the equation just like in linear regression

— shadowtalker

@skan no, I mean $E[y|x]$. $x$ and $y$ are random variables corresponding to a single observation. $x$ is a vector indexed by $j$; $x_j$ is the random variable representing a specific feature/regressor/input/predictor for that observation.

— shadowtalker

And don't overthink it. Once you understand all the pieces in a GLM, the manipulations here are just a direct application of calculus principles. It really is as simple as taking the derivative with respect to the variable you're interested in.

— shadowtalker

My suggestion would be to create a small grid consisting of combinations of the two rivers and two or three values of each of the covariates, then use the predict function with your grid as newdata. Then graph the results. It is much clearer to look at the values that the model actually predicts. You may or may not want to back-transform the predictions to the original scale of measurement (type = "response").
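The grid idea can be sketched without the original data by plugging the printed coefficients into the model formula by hand, which is what `predict` would compute for a log-link Poisson fit. Everything about the grid is an assumption: the pH and temperature values are made up, and "baseline" stands in for the reference river, whose name does not appear in the output. The helper `predict_count` is hypothetical and is not the R function.

```python
import math
from itertools import product

# Estimates copied from the summary table in the question.
coef = {
    "(Intercept)": 3.144257,
    "riverWatauga": -0.049016,
    "pH": 0.086460,
    "temp": -0.059667,
}

# Hypothetical grid; the real ranges of pH and temp are not given in the post.
rivers = ["baseline", "Watauga"]   # "baseline" = the unnamed reference river
ph_values = [6.5, 7.0, 7.5]
temp_values = [10.0, 15.0, 20.0]

def predict_count(river, ph, temp, scale="response"):
    """Hand-rolled prediction: linear predictor, optionally back-transformed."""
    eta = (coef["(Intercept)"]
           + coef["riverWatauga"] * (1.0 if river == "Watauga" else 0.0)
           + coef["pH"] * ph
           + coef["temp"] * temp)
    return math.exp(eta) if scale == "response" else eta

grid = [
    {"river": r, "pH": p, "temp": t, "fit": predict_count(r, p, t)}
    for r, p, t in product(rivers, ph_values, temp_values)
]

for row in grid:
    print(f"{row['river']:>8}  pH={row['pH']:.1f}  temp={row['temp']:.0f}  "
          f"expected darters={row['fit']:.2f}")
```

Passing `scale="link"` returns the value on the log scale, mirroring the choice mentioned above between predictions on the link scale and on the original scale of measurement.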

— Russ Lenth

As much as I like this approach (I do it all the time) I think it's counterproductive for building understanding.

— shadowtalker