Proof that the least-squares estimate of $\beta_0$ in linear regression is unbiased
First, define the linear regression model as:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + ... + \beta_p x_{ip} + \epsilon_i$$
where $y_i$ is the response of the $i$-th observation, $x_{i1}, x_{i2}, ..., x_{ip}$ are its $p$ regressors, $\beta_0, \beta_1, \beta_2, ..., \beta_p$ are the regression coefficients, and $\epsilon_i$ is the error term of the $i$-th observation.
Recall that the coefficients of a linear regression model can be estimated by ordinary least squares.
To establish the unbiasedness of $\beta_0$, we must show that $E(\hat{\beta_0}) = \beta_0$, where $\hat{\beta_0}$ is the least-squares estimate of $\beta_0$.
From the first normal equation of least squares, we obtain:
$$\hat{\beta_0} = \bar{y} - \hat{\beta_1} \bar{x}_1 - \hat{\beta_2} \bar{x}_2 - ... - \hat{\beta_p} \bar{x}_p$$
where $\bar{y}$ is the sample mean of the response $y$, $\bar{x}_1, \bar{x}_2, ..., \bar{x}_p$ are the sample means of the regressors $x_1, x_2, ..., x_p$, and $\hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$ are the least-squares estimates of the slope coefficients.
Expanding the sample means, this can be written as:
$$\hat{\beta_0} = \frac{1}{n} \sum_{i=1}^n y_i - \hat{\beta_1} \frac{1}{n} \sum_{i=1}^n x_{i1} - \hat{\beta_2} \frac{1}{n} \sum_{i=1}^n x_{i2} - ... - \hat{\beta_p} \frac{1}{n} \sum_{i=1}^n x_{ip}$$
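As a quick numerical sanity check of this identity (a sketch using NumPy on made-up data; the sample size, coefficients, and seed are arbitrary), the intercept returned by a least-squares fit matches $\bar{y} - \hat{\beta_1}\bar{x}_1 - ... - \hat{\beta_p}\bar{x}_p$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))              # fixed design: p regressors
beta = np.array([2.0, -1.0, 0.5])        # true slope coefficients
y = 1.5 + X @ beta + rng.normal(size=n)  # true intercept beta_0 = 1.5

# Fit by least squares with an explicit intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0_hat, b_hat = coef[0], coef[1:]

# The fitted intercept equals ybar - sum_j b_hat_j * xbar_j.
identity = y.mean() - b_hat @ X.mean(axis=0)
print(np.isclose(b0_hat, identity))  # True
```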
Next, we compute the expectation $E(\hat{\beta_0})$. Note that the slope estimates $\hat{\beta_1}, ..., \hat{\beta_p}$ depend on $y$ and are therefore random: they cannot be pulled outside the expectation as constants. The sample means $\bar{x}_1, ..., \bar{x}_p$, by contrast, are fixed numbers. Taking expectations of both sides of the expression above therefore gives:
$$E(\hat{\beta_0}) = E(\bar{y}) - E(\hat{\beta_1}) \bar{x}_1 - E(\hat{\beta_2}) \bar{x}_2 - ... - E(\hat{\beta_p}) \bar{x}_p$$
Here we use the fact that the regressors $x_{i1}, x_{i2}, ..., x_{ip}$ are fixed design values rather than random variables, so the sample means
$$\bar{x}_j = \frac{1}{n}\sum_{i=1}^n x_{ij}, \quad j = 1, ..., p$$
are constants and no expectation needs to be taken over them.
Meanwhile, since the error terms $\epsilon_i$ are independent with mean zero, i.e. $E(\epsilon_i) = 0$, the expectation of the response $y_i$ is:
$$E(y_i) = E(\beta_0 + \beta_1 x_{i1} + ... + \beta_p x_{ip} + \epsilon_i) = \beta_0 + \beta_1 x_{i1} + ... + \beta_p x_{ip}$$
Averaging over $i = 1, ..., n$ then gives:
$$E(\bar{y}) = \beta_0 + \beta_1 \bar{x}_1 + ... + \beta_p \bar{x}_p$$
Substituting $E(\bar{y}) = \beta_0 + \beta_1 \bar{x}_1 + ... + \beta_p \bar{x}_p$ into $E(\hat{\beta_0}) = E(\bar{y}) - \sum_{j=1}^p E(\hat{\beta}_j) \bar{x}_j$, we have:
$$E(\hat{\beta_0}) = \beta_0 + \sum_{j=1}^p \beta_j \bar{x}_j - \sum_{j=1}^p E(\hat{\beta}_j) \bar{x}_j$$
It therefore remains to show that $\hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$ are unbiased estimates of $\beta_1, \beta_2, ..., \beta_p$.
Because $\hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$ are least-squares estimates, the residuals are orthogonal to each regressor; that is, they satisfy the normal equations:
$$\sum_{i=1}^n (y_i - \hat{\beta_0} - \hat{\beta_1} x_{i1} - \hat{\beta_2} x_{i2} - ... - \hat{\beta_p} x_{ip}) x_{i1} = 0$$
$$\sum_{i=1}^n (y_i - \hat{\beta_0} - \hat{\beta_1} x_{i1} - \hat{\beta_2} x_{i2} - ... - \hat{\beta_p} x_{ip}) x_{i2} = 0$$
$$...$$
$$\sum_{i=1}^n (y_i - \hat{\beta_0} - \hat{\beta_1} x_{i1} - \hat{\beta_2} x_{i2} - ... - \hat{\beta_p} x_{ip}) x_{ip} = 0$$
Expanding these equations, together with the intercept condition $\sum_{i=1}^n (y_i - \hat{\beta_0} - \hat{\beta_1} x_{i1} - ... - \hat{\beta_p} x_{ip}) = 0$, yields:
$$\sum_{i=1}^n y_i - n \hat{\beta_0} - \hat{\beta_1} \sum_{i=1}^n x_{i1} - \hat{\beta_2} \sum_{i=1}^n x_{i2} - ... - \hat{\beta_p} \sum_{i=1}^n x_{ip} = 0$$
$$\sum_{i=1}^n x_{i1} y_i - \hat{\beta_0} \sum_{i=1}^n x_{i1} - \hat{\beta_1} \sum_{i=1}^n x_{i1}^2 - \hat{\beta_2} \sum_{i=1}^n x_{i1} x_{i2} - ... - \hat{\beta_p} \sum_{i=1}^n x_{i1} x_{ip} = 0$$
$$\sum_{i=1}^n x_{i2} y_i - \hat{\beta_0} \sum_{i=1}^n x_{i2} - \hat{\beta_1} \sum_{i=1}^n x_{i1} x_{i2} - \hat{\beta_2} \sum_{i=1}^n x_{i2}^2 - ... - \hat{\beta_p} \sum_{i=1}^n x_{i2} x_{ip} = 0$$
$$...$$
$$\sum_{i=1}^n x_{ip} y_i - \hat{\beta_0} \sum_{i=1}^n x_{ip} - \hat{\beta_1} \sum_{i=1}^n x_{i1} x_{ip} - \hat{\beta_2} \sum_{i=1}^n x_{i2} x_{ip} - ... - \hat{\beta_p} \sum_{i=1}^n x_{ip}^2 = 0$$
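These orthogonality conditions are easy to verify numerically: after a least-squares fit, the residual vector is orthogonal to the column of ones and to every regressor column (again a sketch on synthetic data; all parameters below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 2
X = rng.normal(size=(n, p))
y = 0.7 + X @ np.array([1.0, -2.0]) + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])   # design matrix with intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef

# Normal equations: the residual vector is orthogonal to every column of A,
# i.e. sum(resid) = 0 and sum(resid * x_ik) = 0 for each regressor k.
print(np.allclose(A.T @ resid, 0.0))  # True
```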
Substituting $\hat{\beta_0} = \bar{y} - \hat{\beta_1} \bar{x}_1 - ... - \hat{\beta_p} \bar{x}_p$ into the expanded normal equations, the first (intercept) equation is satisfied identically, and after collecting terms the slope equations become:
$$\sum_{i=1}^n x_{i1} y_i - n \bar{x}_1 \bar{y} - \hat{\beta_1} \left(\sum_{i=1}^n x_{i1}^2 - n \bar{x}_1^2\right) - \hat{\beta_2} \left(\sum_{i=1}^n x_{i1} x_{i2} - n \bar{x}_1 \bar{x}_2\right) - ... - \hat{\beta_p} \left(\sum_{i=1}^n x_{i1} x_{ip} - n \bar{x}_1 \bar{x}_p\right) = 0$$
$$\sum_{i=1}^n x_{i2} y_i - n \bar{x}_2 \bar{y} - \hat{\beta_1} \left(\sum_{i=1}^n x_{i1} x_{i2} - n \bar{x}_1 \bar{x}_2\right) - \hat{\beta_2} \left(\sum_{i=1}^n x_{i2}^2 - n \bar{x}_2^2\right) - ... - \hat{\beta_p} \left(\sum_{i=1}^n x_{i2} x_{ip} - n \bar{x}_2 \bar{x}_p\right) = 0$$
$$...$$
$$\sum_{i=1}^n x_{ip} y_i - n \bar{x}_p \bar{y} - \hat{\beta_1} \left(\sum_{i=1}^n x_{i1} x_{ip} - n \bar{x}_1 \bar{x}_p\right) - \hat{\beta_2} \left(\sum_{i=1}^n x_{i2} x_{ip} - n \bar{x}_2 \bar{x}_p\right) - ... - \hat{\beta_p} \left(\sum_{i=1}^n x_{ip}^2 - n \bar{x}_p^2\right) = 0$$
The full set of normal equations can be written in matrix form:
$$\begin{bmatrix} n & n\bar{x}_1 & n\bar{x}_2 & ... & n\bar{x}_p \\ n\bar{x}_1 & \sum_{i=1}^n x_{i1}^2 & \sum_{i=1}^n x_{i1} x_{i2} & ... & \sum_{i=1}^n x_{i1} x_{ip} \\ n\bar{x}_2 & \sum_{i=1}^n x_{i1} x_{i2} & \sum_{i=1}^n x_{i2}^2 & ... & \sum_{i=1}^n x_{i2} x_{ip} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ n\bar{x}_p & \sum_{i=1}^n x_{i1} x_{ip} & \sum_{i=1}^n x_{i2} x_{ip} & ... & \sum_{i=1}^n x_{ip}^2 \end{bmatrix} \begin{bmatrix}\hat{\beta_0} \\ \hat{\beta_1} \\ \hat{\beta_2} \\ \vdots \\ \hat{\beta_p} \end{bmatrix} = \begin{bmatrix}\sum_{i=1}^n y_i \\ \sum_{i=1}^n x_{i1} y_i \\ \sum_{i=1}^n x_{i2} y_i \\ \vdots \\ \sum_{i=1}^n x_{ip} y_i \end{bmatrix}$$
Letting $X$ denote the $n \times (p+1)$ design matrix whose first column is all ones and whose remaining columns hold the regressors, and $y = (y_1, ..., y_n)^T$, this system is exactly $X^T X \hat{\beta} = X^T y$. Provided $X^T X$ is invertible, the least-squares estimator is:
$$\hat{\beta} = \begin{bmatrix}\hat{\beta_0} \\ \hat{\beta_1} \\ \hat{\beta_2} \\ \vdots \\ \hat{\beta_p}\end{bmatrix} = (X^T X)^{-1} X^T y$$
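As an illustration (a sketch on synthetic data; the dimensions and coefficients are arbitrary), solving the normal equations $X^T X \hat{\beta} = X^T y$ directly agrees with a standard least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 25, 2
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.5, 2.0]) + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])   # X with a leading column of ones
# Solve the normal equations (A^T A) beta_hat = A^T y directly...
beta_ne = np.linalg.solve(A.T @ A, A.T @ y)
# ...and compare against NumPy's least-squares solver.
beta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(beta_ne, beta_ls))  # True
```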
Since the estimator is linear in $y$ and $E(y) = X\beta$ (the errors have mean zero and the design is fixed), taking expectations gives:
$$E(\hat{\beta}) = (X^T X)^{-1} X^T E(y) = (X^T X)^{-1} X^T X \beta = \beta$$
In particular:
$$E(\hat{\beta_1}) = \beta_1, \ E(\hat{\beta_2}) = \beta_2, \ ..., \ E(\hat{\beta_p}) = \beta_p$$
Substituting these into $E(\hat{\beta_0}) = E(\bar{y}) - \sum_{j=1}^p E(\hat{\beta}_j) \bar{x}_j = \beta_0 + \sum_{j=1}^p \beta_j \bar{x}_j - \sum_{j=1}^p \beta_j \bar{x}_j$, the slope terms cancel, and therefore:
$$E(\hat{\beta_0}) = \beta_0$$
This proves that the least-squares estimate $\hat{\beta_0}$ of the intercept $\beta_0$ in the linear regression model is unbiased.
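Unbiasedness can also be illustrated by simulation: holding the design fixed and redrawing only the noise many times, the average fitted intercept should be close to the true $\beta_0$ (a sketch with arbitrary made-up parameters):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, reps = 30, 2, 5000
X = rng.normal(size=(n, p))              # fixed design, reused in every draw
beta0, beta = 1.5, np.array([2.0, -1.0])

A = np.column_stack([np.ones(n), X])
pinv = np.linalg.pinv(A)                 # (A^T A)^{-1} A^T, computed once

# Redraw only the noise, refit, and record the fitted intercept each time.
b0_hats = [(pinv @ (beta0 + X @ beta + rng.normal(size=n)))[0]
           for _ in range(reps)]
print(np.mean(b0_hats))  # close to the true intercept 1.5
```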