Is there a way to use the covariance matrix to find coefficients for multiple regression?

For a simple linear regression, the regression coefficient can be calculated directly from the variance-covariance matrix $C$ as $\frac{C_{d, e}}{C_{e,e}}$, where $d$ is the index of the dependent variable and $e$ is the index of the explanatory variable.

If one has only the covariance matrix, is it possible to calculate the coefficients for a model with multiple explanatory variables?

ETA: For two explanatory variables, it appears that

$\beta_1 = \frac{Cov(y,x_1)var(x_2) - Cov(y,x_2)Cov(x_1,x_2)}{var(x_1)var(x_2) - Cov(x_1,x_2)^2}$

and analogously for $\beta_2$. I don't immediately see how to extend this to three or more variables.
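The two-variable formula can be checked numerically. The following is a minimal sketch in R on simulated data; the variable names (`x1`, `x2`, `y`, `beta1`, `beta2`) and the generating coefficients are assumptions made up for illustration.

```r
# Simulated data; the generating coefficients are arbitrary assumptions.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n)

# Closed-form slopes for two explanatory variables, from covariances only.
den   <- var(x1) * var(x2) - cov(x1, x2)^2
beta1 <- (cov(y, x1) * var(x2) - cov(y, x2) * cov(x1, x2)) / den
beta2 <- (cov(y, x2) * var(x1) - cov(y, x1) * cov(x1, x2)) / den

# Compare with the slopes from an ordinary least-squares fit.
fit <- coef(lm(y ~ x1 + x2))
print(rbind(formula = c(beta1, beta2), lm = unname(fit[c("x1", "x2")])))
```

The expression for `beta2` is obtained from the one for `beta1` by swapping the roles of `x1` and `x2`.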

regression regression-coefficients covariance-matrix

— David

The coefficient vector $\hat{\beta}$ is the solution to $X'X\hat{\beta} = X'Y$. Some algebraic manipulation reveals that this is in fact the same formula as the one you give for the 2-coefficient case. It is nicely laid out here: stat.purdue.edu/~jennings/stat514/stat512notes/topic3.pdf . I'm not sure whether that helps at all, but I'd venture a guess that this is impossible in general based on this formula.

— shadowtalker

@David Did you figure out how to extend this to an arbitrary number of explanatory variables (beyond 2)? I need the expression.

— Jane Wayne

@JaneWayne I'm not sure I understand your question: whuber gave the solution below in matrix form, $C^{-1}(\text{Cov}(X_i, y))^\prime$

— David

Yup, I studied it and he's right.

— Jane Wayne

Yes, the covariance matrix of all the variables (explanatory and response) contains the information needed to find all the coefficients, provided an intercept (constant) term is included in the model. (Although the covariances provide no information about the constant term, it can be found from the means of the data.)

Analysis

Let the data for the explanatory variables be arranged as $n$-dimensional column vectors $x_1, x_2, \ldots, x_p$, and let the response variable be the column vector $y$, considered a realization of a random variable $Y$. The ordinary least squares estimates $\hat\beta$ of the coefficients in the model

$\mathbb{E}(Y) = \alpha + X\beta$

are obtained by assembling the $p+1$ column vectors $X_0 = (1, 1, \ldots, 1)^\prime, X_1, \ldots, X_p$ into an $n \times (p+1)$ array $X$ and solving the system of linear equations

$X^\prime X \hat\beta = X^\prime y.$

It is equivalent to the system

$\frac{1}{n}X^\prime X \hat\beta = \frac{1}{n}X^\prime y.$

Gaussian elimination will solve this system. It proceeds by adjoining the $(p+1)\times (p+1)$ matrix $\frac{1}{n}X^\prime X$ and the $(p+1)$-vector $\frac{1}{n}X^\prime y$ into a $(p+1) \times (p+2)$ array $A$ and row-reducing it.

The first step will inspect $\frac{1}{n}(X^\prime X)_{11} = \frac{1}{n}X_0^\prime X_0 = 1$. Since this is nonzero, it subtracts suitable multiples of the first row of $A$ from the remaining rows to zero out the rest of the first column. The multiple for row $i+1$ is $\frac{1}{n}X_0^\prime X_i = \overline X_i$, so the number subtracted from the entry $A_{i+1,j+1} = \frac{1}{n}X_i^\prime X_j$ will equal $\overline X_i \overline X_j$. What remains, $\frac{1}{n}X_i^\prime X_j - \overline X_i \overline X_j$, is just the formula for the covariance of $X_i$ and $X_j$. Moreover, the number left in the $i+1, p+2$ position equals $\frac{1}{n}X_i^\prime y - \overline{X_i}\overline{y}$, the covariance of $X_i$ with $y$. (These covariances use the divisor $n$; using $n-1$ instead rescales both sides of the reduced system by the same factor and leaves the solution unchanged.)
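The first elimination step can be verified numerically. The following is a minimal sketch in R on simulated data; the names `A`, `C.n`, and `b.n` are introduced here for illustration and are not part of the original answer.

```r
# Simulated data; sizes and names are assumptions for illustration.
set.seed(17)
n <- 10
p <- 2
x <- matrix(rnorm(n * p), nrow = n)
y <- rnorm(n)

X <- cbind(1, x)                                   # n x (p+1), column of ones first
A <- cbind(crossprod(X) / n, crossprod(X, y) / n)  # (p+1) x (p+2) augmented array

# First elimination step: the pivot A[1, 1] is 1, so subtract A[i, 1] times
# row 1 from each later row i to zero out the rest of the first column.
for (i in 2:(p + 1)) {
  A[i, ] <- A[i, ] - A[i, 1] * A[1, ]
}

# What remains below the first row: covariances with divisor n.
C.n <- A[-1, 2:(p + 1)]   # covariances among the regressors
b.n <- A[-1, p + 2]       # covariances of each regressor with y

# R's cov() uses divisor n - 1, so rescale before comparing.
print(all.equal(C.n, cov(x) * (n - 1) / n))
print(all.equal(b.n, cov(x, y)[, 1] * (n - 1) / n))
```

Because the factor $(n-1)/n$ multiplies both `C.n` and `b.n`, solving the reduced system gives the same slopes whichever divisor is used.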

Thus, after the first step of Gaussian elimination the system is reduced to solving

$C\hat{\beta} = (\text{Cov}(X_i, y))^\prime$

where $C$ is the covariance matrix of the explanatory variables and the right-hand side collects their covariances with $y$. Since all the coefficients of this system are covariances, the solution can be found from the covariance matrix of all the variables.

(When $C$ is invertible the solution can be written $C^{-1}(\text{Cov}(X_i, y))^\prime$ . The formulas given in the question are special cases of this when $p=1$ and $p=2$ . Writing out such formulas explicitly will become more and more complex as $p$ grows. Moreover, they are inferior for numerical computation, which is best carried out by solving the system of equations rather than by inverting the matrix $C$ .)
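As a worked check of the $p=2$ special case, the explicit inverse of a $2\times 2$ matrix can be written out. The shorthand $s_{ij} = \text{Cov}(x_i, x_j)$ and $s_{iy} = \text{Cov}(x_i, y)$ is introduced here only for brevity.

$C = \begin{pmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{pmatrix}, \qquad C^{-1} = \frac{1}{s_{11}s_{22} - s_{12}^2}\begin{pmatrix} s_{22} & -s_{12} \\ -s_{12} & s_{11} \end{pmatrix}$

$\hat\beta = C^{-1}\begin{pmatrix} s_{1y} \\ s_{2y} \end{pmatrix} = \frac{1}{s_{11}s_{22} - s_{12}^2}\begin{pmatrix} s_{1y}s_{22} - s_{2y}s_{12} \\ s_{2y}s_{11} - s_{1y}s_{12} \end{pmatrix}$

With $s_{11} = var(x_1)$ and $s_{22} = var(x_2)$, the first component is the expression for $\beta_1$ given in the question, and the second is its analogue for $\beta_2$.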

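The constant term will be the difference between the mean of $y$ and the mean of the values predicted from the estimates, $X\hat{\beta}$. That is, $\hat\alpha = \overline y - \overline X \hat\beta$, where $\overline X$ is the row vector of the means of the explanatory variables.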

Example

To illustrate, the following R code creates some data, computes their covariances, and obtains the least squares coefficient estimates solely from that information. It compares them to the estimates obtained from the least-squares estimator lm.

#
# 1. Generate some data.
#
n <- 10        # Data set size
p <- 2         # Number of regressors
set.seed(17)
z <- matrix(rnorm(n*(p+1)), nrow=n, dimnames=list(NULL, paste0("x", 1:(p+1))))
y <- z[, p+1]
x <- z[, -(p+1), drop=FALSE]; 
#
# 2. Find the OLS coefficients from the covariances only.
#
a <- cov(x)
b <- cov(x,y)
beta.hat <- solve(a, b)[, 1]  # Coefficients from the covariance matrix
#
# 2a. Find the intercept from the means and coefficients.
#
y.bar <- mean(y)
x.bar <- colMeans(x)
intercept <- y.bar - x.bar %*% beta.hat

The output shows agreement between the two methods:

(rbind(`From covariances` = c(`(Intercept)`=intercept, beta.hat),
       `From data via OLS` = coef(lm(y ~ x))))

                  (Intercept)        x1        x2
From covariances     0.946155 -0.424551 -1.006675
From data via OLS    0.946155 -0.424551 -1.006675
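When only the joint covariance matrix and the variable means are available, the same calculation can be wrapped in a small function. The following is a sketch under that assumption; `ols_from_cov` and its arguments `S`, `m`, and `response` are hypothetical names, not part of R or of the code above.

```r
# Hypothetical helper: OLS coefficients from a joint covariance matrix S,
# a vector of means m, and the name (or index) of the response variable.
ols_from_cov <- function(S, m, response) {
  idx <- if (is.character(response)) match(response, colnames(S)) else response
  C <- S[-idx, -idx, drop = FALSE]       # covariances among the regressors
  b <- S[-idx, idx]                      # covariances of regressors with response
  beta <- solve(C, b)                    # solve C beta = b; no explicit inverse
  names(beta) <- colnames(C)
  alpha <- m[idx] - sum(m[-idx] * beta)  # intercept from the means
  c(`(Intercept)` = unname(alpha), beta)
}

# Check against lm() on simulated data (names and sizes are assumptions).
set.seed(17)
z <- matrix(rnorm(30), nrow = 10, dimnames = list(NULL, c("x1", "x2", "y")))
est <- ols_from_cov(cov(z), colMeans(z), "y")
ref <- coef(lm(y ~ x1 + x2, data = as.data.frame(z)))
print(rbind(`From cov and means` = est, `From lm` = ref))
```

The function never touches the raw data: `cov(z)` and `colMeans(z)` stand in for summaries computed earlier.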

— whuber

Thanks, @whuber! This is exactly what I was looking for, and what my atrophied brain was unable to get to. As an aside, the motivation for the question is that for various reasons we essentially do not have the full $X$ available, but have cov(z) from previous calculations.

— David

Answers like this raise the bar for Cross Validated.

— jpmuc

@whuber In your example, you computed the intercept from y and x and beta.hat. The y and x are part of the original data. Is it possible to derive the intercept from the covariance matrix and means alone? Could you please provide the notation?

— Jane Wayne

@Jane Given only the means $\overline X$, apply $\hat\beta$ to them: $\overline X \hat\beta = \overline{X \hat\beta}.$ I have changed the code to reflect this.

— whuber

very helpful +1 for the code

— Michael