Professor, Department of Statistics
2026-04-27
\[\frac{\frac{\bar X-\mu}{\sqrt{\sigma^2/n}}}{\sqrt{\frac{(n-1)s^2/\sigma^2}{n-1}}}=\frac{\sqrt{n}(\bar X-\mu)}{s}\sim t_{n-1}.\]
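This identity can be checked by simulation. The following Python sketch (the choices of \(n\), \(\mu\), \(\sigma\), the seed, and `numpy` itself are assumptions for illustration) compares the first two moments of the simulated statistic with those of \(t_{n-1}\):

```python
import numpy as np

rng = np.random.default_rng(7)
n, mu, sigma = 15, 1.0, 2.0
reps = 200_000

# reps independent samples of size n from N(mu, sigma^2)
X = rng.normal(mu, sigma, size=(reps, n))
xbar = X.mean(axis=1)
s = X.std(axis=1, ddof=1)          # sample standard deviation (divisor n-1)

T = np.sqrt(n) * (xbar - mu) / s   # should follow t with n-1 = 14 df

mean_hat = T.mean()                # theory: 0
var_hat = T.var()                  # theory: (n-1)/(n-3) = 14/12
```

A tighter check would compare empirical quantiles against the \(t_{14}\) quantiles (e.g., via `scipy.stats.t`), omitted here to keep the sketch dependency-free.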
Sample mean vector follows a multivariate normal, i.e., \(\bar{\mathbf X} \sim \mathbf N(\boldsymbol \mu, \boldsymbol \Sigma/n)\)
Sample covariance matrix \((n-1)\mathbf S\) follows a Wishart distribution, i.e., \((n-1)\mathbf S \sim Wishart_p (n-1, \boldsymbol \Sigma)\)
Independence between \(\bar {\mathbf X}\) and \(\mathbf S\).
Hotelling’s \(T^2\): \(T^2 = (\bar{\mathbf X} - \boldsymbol \mu)^T\left(\frac{\mathbf S}{n}\right)^{-1} (\bar{\mathbf X} - \boldsymbol \mu)\)
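As a numerical illustration, the statistic can be computed directly from a simulated sample. This is a Python sketch; the choices of \(n\), \(p\), \(\boldsymbol\mu\), \(\boldsymbol\Sigma\), and the seed are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
mu = np.zeros(p)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Random sample of size n from N(mu, Sigma)
X = rng.multivariate_normal(mu, Sigma, size=n)

xbar = X.mean(axis=0)            # sample mean vector
S = np.cov(X, rowvar=False)      # sample covariance matrix (divisor n-1)

# Hotelling's T^2 = (xbar - mu)^T (S/n)^{-1} (xbar - mu)
d = xbar - mu
T2 = float(d @ np.linalg.solve(S / n, d))
```

A standard fact (not derived here) is that under the stated model \(\frac{n-p}{p(n-1)}T^2 \sim F_{p,\,n-p}\), which is how \(T^2\) is calibrated in practice.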
\[\mathbf{P}^2 = \mathbf{P} \mbox{, }\mathbf{P}=\mathbf{P}^T \]
\[ \gamma_i^T\gamma_j = \left\{ \begin{array}{ll} 1 & \mbox{if } i=j\\ 0 & \mbox{if } i\not=j \end{array} \right. \]
The goal is to find the projection matrix that projects any vector in \(\mathbb{R}^3\) onto the plane defined by \(x + y + z = 0\).
Steps:
The plane equation is \(x + y + z = 0\).
We need two basis vectors \(\mathbf{a}_1, \mathbf{a}_2\) that span the plane, e.g., \(\mathbf{a}_1=(1,-1,0)^T\) and \(\mathbf{a}_2=(1,0,-1)^T\); with \(\mathbf A=(\mathbf a_1\ \mathbf a_2)\), the projection matrix is \(\mathbf P=\mathbf A(\mathbf A^T\mathbf A)^{-1}\mathbf A^T\):
\[\mathbf P = \frac{1}{3}\begin{pmatrix} 2 & -1 & -1\\ -1 & 2 & -1\\ -1 & -1 & 2 \end{pmatrix}\]
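This projection matrix can be reproduced in code. The Python sketch below (numpy is an assumption; the particular basis vectors are one arbitrary choice) builds \(\mathbf P\) two ways: from two basis vectors of the plane via \(\mathbf A(\mathbf A^T\mathbf A)^{-1}\mathbf A^T\), and as \(\mathbf I-\mathbf n\mathbf n^T/\|\mathbf n\|^2\) with the plane's normal vector \(\mathbf n=(1,1,1)^T\):

```python
import numpy as np

# Normal vector of the plane x + y + z = 0
nvec = np.array([1.0, 1.0, 1.0])

# Projecting onto the plane = identity minus projection onto the normal
P = np.eye(3) - np.outer(nvec, nvec) / (nvec @ nvec)

# Equivalently, via two basis vectors spanning the plane
A = np.array([[1.0, 1.0],
              [-1.0, 0.0],
              [0.0, -1.0]])  # columns a1 = (1,-1,0)^T, a2 = (1,0,-1)^T
P_alt = A @ np.linalg.solve(A.T @ A, A.T)

idempotent = np.allclose(P @ P, P)   # P^2 = P
symmetric = np.allclose(P, P.T)      # P = P^T
same = np.allclose(P, P_alt)         # both constructions agree
```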
Let’s project \(\mathbf{v} = [3,1,2]^T\): \(v_{\text{proj}}=P \mathbf{v}\).
\[\mathbf v_{\text{proj}} = \mathbf P\mathbf v = \begin{pmatrix} 1\\ -1\\ 0 \end{pmatrix}\]
Verification: the components of \(v_{\text{proj}}\) sum to \(0\), so \(v_{\text{proj}}\) indeed lies on the plane \(x+y+z=0\).
Eigen-decomposition of the \(10\times 10\) centering matrix \(\mathbb C\): nine eigenvalues equal \(1\) and one equals \(0\) (numerically \(8.9\times 10^{-16}\)).
The corresponding eigenvectors form an orthonormal \(10\times 10\) matrix \(\boldsymbol\Gamma\); the column paired with the zero eigenvalue is proportional to \(\mathbf 1\), with every entry equal to \(-1/\sqrt{10}\approx -0.32\).
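These numerical checks can be reproduced with a few lines. This is a Python sketch (numpy is an assumption), with \(n=10\) as above:

```python
import numpy as np

n = 10
C = np.eye(n) - np.ones((n, n)) / n     # centering matrix

vals, Gamma = np.linalg.eigh(C)         # eigen-decomposition of a symmetric matrix

# A rank n-1 projection: n-1 eigenvalues equal 1, one equals 0
num_ones = int(np.sum(np.isclose(vals, 1.0)))
num_zeros = int(np.sum(np.isclose(vals, 0.0)))

# The eigenvector for eigenvalue 0 is proportional to the all-ones vector
null_vec = Gamma[:, np.argmin(np.abs(vals))]
prop_to_ones = np.allclose(np.abs(null_vec), 1 / np.sqrt(n))

# Orthonormal eigenvectors: Gamma^T Gamma = Gamma Gamma^T = I
orthonormal = (np.allclose(Gamma.T @ Gamma, np.eye(n))
               and np.allclose(Gamma @ Gamma.T, np.eye(n)))
```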
Numerically, \(\boldsymbol\Gamma\) is orthogonal: both \(\boldsymbol\Gamma^T\boldsymbol\Gamma\) and \(\boldsymbol\Gamma\boldsymbol\Gamma^T\) equal \(\mathbf I_{10}\) up to floating-point error (off-diagonal entries of order \(10^{-16}\)).
Note that \[\mathbb C=\sum_{j=1}^{n-1} \gamma_j\gamma_j^T,\] the spectral decomposition of \(\mathbb C\).
Since \(\mathbb C\mathbf X\) is a linear function of \(\mathbf X\) and it can be verified that \(\mathbb C\mathbf 1=\mathbf 0\), we have \[E[\mathbb C\mathbf X]=\mathbb C\, E[\mathbf X]=\mu\, \mathbb C\mathbf 1=\mathbf 0.\]
Multivariate: Let \(\mathbf X_{n\times p}\) be a random sample from \(N(\boldsymbol\mu, \boldsymbol \Sigma)\). Similarly, it can be shown that \(\mathbb C \mathbf X\) has mean \(\mathbf 0_{n\times p}\). We have verified this numerically.
In either situation, we have \(\mathbb C \mathbf X = \mathbb C (\mathbf X-E[\mathbf X])\) because \(\mathbb C\, E[\mathbf X]=\mathbf 0\). This fact will be used later.
Definition. Let \(Z_1, ..., Z_k \overset{iid}\sim N(0,1)\). Then, the sum of squares \(Q = Z_1^2 + ... + Z_k^2\) has a chi-squared distribution with \(k\) degrees of freedom, denoted by \(\chi_k^2\).
Alternatively, let \(\mathbf Z_{k\times 1} \sim N(\mathbf 0, \mathbf I)\). We say \(||\mathbf Z||^2=\mathbf Z^T \mathbf Z\) follows \(\chi_k^2\).
\(Z_1^2, \cdots, Z_k^2 \overset{iid}\sim \chi_1^2\).
The sum of independent chi-squared random variables is also chi-squared. Specifically, if \(Q_1\sim \chi_{k_1}^2\) and \(Q_2\sim \chi_{k_2}^2\) are independent, then \(Q_1+Q_2\sim \chi_{k_1+k_2}^2\).
The density of a chi-squared random variable with \(k\) degrees of freedom is \[ f(x) = \frac{1}{2^{k/2}\Gamma(k/2)} x^{k/2-1} e^{-x/2} \mbox{, } x>0, \]
where \(\Gamma(\cdot)\) is the gamma function.
The MGF of a chi-squared random variable with \(k\) degrees of freedom is: \[ M_X(t) = (1-2t)^{-k/2} \mbox{, } t<1/2. \]
The mean and variance of a chi-squared random variable with \(k\) degrees of freedom are:
\[\text{E}[X] = k \mbox{, }\text{Var}[X] = 2k\]
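These moments are easy to confirm by Monte Carlo. A Python sketch (numpy, the value of \(k\), the number of replications, and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
k, reps = 5, 200_000

# Q = Z_1^2 + ... + Z_k^2 with Z_i iid N(0, 1)
Z = rng.standard_normal((reps, k))
Q = (Z ** 2).sum(axis=1)

mean_hat = Q.mean()   # theory: E[Q] = k
var_hat = Q.var()     # theory: Var[Q] = 2k
```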
Let \(\mathbf P_{n\times n}\) be a projection matrix with rank \(r\) and let \(\mathbf Z_{n\times 1}\sim N(\mathbf 0, \mathbf I)\).
Claim: \(\mathbf Z^T \mathbf P \mathbf Z\sim \chi_r^2\).
Proof: By the spectral decomposition of \(\mathbf P\), we have
\[\mathbf P= \sum_{i=1}^r \gamma_i \gamma_i^T,\] where \(\gamma_1, \cdots, \gamma_r\) are orthonormal vectors, i.e., \(\gamma_i^T \gamma_j = 0\) for \(i\not=j\) and \(\gamma_i^T \gamma_i=1\) for \(i=1, \cdots, r\).
\[ \begin{aligned} \mathbf Z^T \mathbf P \mathbf Z &= \mathbf Z^T \sum_{i=1}^r \gamma_i \gamma_i^T \mathbf Z= \sum_{i=1}^r \mathbf Z^T \gamma_i \gamma_i^T \mathbf Z\\ &= \sum_{i=1}^r (\gamma_i^T \mathbf Z)^T(\gamma_i^T \mathbf Z) \end{aligned} \]
Let \(Y_i=\gamma_i^T \mathbf Z\). As a linear function of \(\mathbf Z\), \[Y_i \sim N(\gamma_i^T \mathbf 0,\ \gamma_i^T \mathbf I \gamma_i)=N(0,1).\] The \(Y_i\)'s are jointly normal with \(Cov[Y_i, Y_j]=\gamma_i^T\gamma_j=0\) for \(i\not=j\), so \[Y_1, \cdots, Y_r\overset{iid}\sim N(0,1),\] and therefore \(\mathbf Z^T \mathbf P \mathbf Z=\sum_{i=1}^r Y_i^2 \sim \chi_r^2\).
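The claim can also be checked by simulation. This Python sketch (numpy, the dimensions, and the seed are assumptions) uses the centering matrix as \(\mathbf P\), a projection of rank \(r=n-1\), and compares the first two moments of \(\mathbf Z^T\mathbf P\mathbf Z\) with those of \(\chi_r^2\):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 6, 200_000
P = np.eye(n) - np.ones((n, n)) / n    # projection matrix of rank r = n - 1
r = n - 1

Z = rng.standard_normal((reps, n))
Q = np.einsum('ij,jk,ik->i', Z, P, Z)  # Z^T P Z for each replicate

mean_hat = Q.mean()   # theory: E = r = 5
var_hat = Q.var()     # theory: Var = 2r = 10
```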
Let \(\mathbb C=\mathbf I - \frac{1}{n} \mathbf 1\mathbf 1^T\) be the centering matrix.
We have shown that \(\mathbb C\) is a projection matrix with rank \(n-1\):
\((n-1)s^2=\mathbf X^T\mathbb C \mathbf X\) because
\[\begin{aligned} (n-1)s^2&=\sum_{i=1}^n (X_i-\bar X)^2 = (\mathbb C \mathbf X)^T (\mathbb C \mathbf X) = \mathbf X^T \mathbb C^T \mathbb C \mathbf X = \mathbf X^T \mathbb C \mathbf X, \end{aligned}\] where the last equality uses \(\mathbb C^T \mathbb C = \mathbb C\).
Let \(\mathbf Z=\frac{\mathbf X - E[\mathbf X]}{\sigma}\). It is easy to see that \(\mathbf Z\sim N(\mathbf 0, \mathbf I)\). Since \(\mathbb C \mathbf X = \mathbb C (\mathbf X - E[\mathbf X]) = \sigma\, \mathbb C \mathbf Z\), we have
\[\frac{(n-1)s^2}{\sigma^2}=\mathbf Z^T \mathbb C \mathbf Z \sim \chi_{n-1}^2\] by the claim above, because \(\mathbb C\) is a projection matrix with rank \(n-1\).
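A quick simulation confirms that \((n-1)s^2/\sigma^2\) has the moments of \(\chi_{n-1}^2\). A Python sketch (numpy, \(n\), \(\mu\), \(\sigma\), and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, mu, sigma = 8, 2.0, 1.5
reps = 200_000

X = rng.normal(mu, sigma, size=(reps, n))
s2 = X.var(axis=1, ddof=1)          # sample variance (divisor n-1)
Q = (n - 1) * s2 / sigma ** 2       # should follow chi-squared with n-1 df

mean_hat = Q.mean()   # theory: n - 1 = 7
var_hat = Q.var()     # theory: 2(n - 1) = 14
```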
The Wishart distribution is named after the British statistician John Wishart, who introduced it in his 1928 paper published in Biometrika.
Wishart was interested in the problem of estimating the covariance matrix of a multivariate normal distribution.
Wishart showed that the sample covariance matrix follows a particular probability distribution that we now call the Wishart distribution.
The Wishart distribution has become a fundamental tool in multivariate statistical analysis.
A Wishart distribution can be defined in the following way.
Let \(\mathbf W\) be a \(p\times p\) random matrix.
We say \(\mathbf W\) follows \(Wishart_{p}(k, \boldsymbol \Sigma)\) if \(\mathbf W\) can be written as \(\mathbf W=\mathbf X^T \mathbf X\) where \(\mathbf X\) denotes the random matrix formed by a random sample of size \(k\) from MVN \(N(\mathbf 0, \boldsymbol \Sigma)\).
Remark: \(E[\mathbf W]=k\boldsymbol\Sigma\).
That is, if \[\mathbf X_1, \cdots, \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma),\]
then \[\mathbf X^T \mathbf X=\sum_{i=1}^k \mathbf X_i \mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma).\]
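The remark \(E[\mathbf W]=k\boldsymbol\Sigma\) can be verified numerically. A Python sketch (numpy, \(k\), \(p\), \(\boldsymbol\Sigma\), and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
k, p, reps = 10, 2, 20_000
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])

# reps draws of X (k x p) with rows iid N(0, Sigma); each W = X^T X
X = rng.multivariate_normal(np.zeros(p), Sigma, size=(reps, k))
W_mean = np.einsum('rki,rkj->ij', X, X) / reps   # average of X^T X over replicates

# theory: E[W] = k * Sigma
```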
Wishart: If \(\mathbf X_1, \cdots \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma)\), then \[\mathbf X^T \mathbf X =\sum_{i=1}^k \mathbf X_i\mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma) \mbox{, where } \mathbf X_{k\times p}=\begin{pmatrix} X_1^T\\ \vdots\\ X_k^T \end{pmatrix} \]
Chi-squared: If \(X_1, \cdots, X_k \overset{iid}\sim N(0,1)\), then
\[\mathbf X^T\mathbf X=\sum_{i=1}^k X_i^2\sim \chi_k^2 \mbox{, where } \mathbf X_{k\times 1}=
\begin{pmatrix}
X_1 \\ \vdots \\ X_k
\end{pmatrix}\]
The sample covariance satisfies \((n-1)\mathbf S=\mathbf X^T \mathbb C \mathbf X\), so the definition of the Wishart distribution is not immediately applicable.
Rewrite \((n-1)\mathbf S\):
\[ \begin{aligned} (n-1)\mathbf S&=\mathbf X^T \mathbb C \mathbf X=\mathbf X^T \mathbb C^T\mathbb C\,\mathbb C \mathbf X=(\mathbb C \mathbf X)^T\mathbb C(\mathbb C \mathbf X)\\ &=(\mathbb C \mathbf X)^T\sum_{j=1}^{n-1}\gamma_j \gamma_j^T (\mathbb C \mathbf X)\\ &=\sum_{j=1}^{n-1} (\gamma_j^T \mathbb C \mathbf X)^T (\gamma_j^T \mathbb C \mathbf X) \end{aligned}\]Let \(Y_j=(\gamma_j^T \mathbb C \mathbf X)^T=\mathbf X^T \mathbb C \gamma_j\), a \(p\times 1\) random vector, so that \((n-1)\mathbf S=\sum_{j=1}^{n-1} Y_j Y_j^T\).
\[\begin{aligned} Cov[Y_i, Y_j]&=E[(Y_i-\mathbf 0 )(Y_j-\mathbf 0)^T]= E[Y_iY_j^T] \\ &= E[(\gamma_i^T \mathbb C \mathbf X)^T(\gamma_j^T \mathbb C \mathbf X)]= E[\mathbf X^T \mathbb C \gamma_i \gamma_j^T \mathbb C \mathbf X] \\ &= tr(\mathbb C \gamma_i \gamma_j^T \mathbb C)\, \boldsymbol \Sigma = (\gamma_j^T \gamma_i)\, \boldsymbol \Sigma =\mathbf 0 \end{aligned}\]
Here the second-to-last equality uses \(\mathbb C\mathbf X=\mathbb C(\mathbf X-E[\mathbf X])\) and the fact that \(E[(\mathbf X-E[\mathbf X])^T\mathbf A(\mathbf X-E[\mathbf X])]=tr(\mathbf A)\,\boldsymbol\Sigma\) for a fixed matrix \(\mathbf A\); the last step holds because \(tr(\mathbb C \gamma_i \gamma_j^T \mathbb C)=\gamma_j^T\mathbb C\mathbb C\gamma_i=\gamma_j^T\gamma_i=0\) for \(i\not=j\).
Since \(Y_i\) and \(Y_j\) are linear functions of the same MVN-distributed random matrix (or its vectorized version) and are uncorrelated, \(Y_i\) and \(Y_j\) are independent for \(i\not=j\).
We understand that \(Y_i\) follows a MVN with mean \(\mathbf 0\). How about its covariance matrix? Next, we introduce matrix normal distribution.
Let \(\mathbf X_1, \cdots \mathbf X_n\) be a random sample (therefore iid) from \(N(\boldsymbol \mu, \boldsymbol \Sigma)\).
If we stack \(\mathbf X_1, \cdots, \mathbf X_n\) into an \(n\times p\) random matrix \(\mathbf X\), we say \(\mathbf X\) follows a matrix normal distribution: \[\mathbf X \sim N(\mathbf 1_n \boldsymbol \mu^T, \boldsymbol \Sigma, \mathbf I_n)\]
Consider the linear function \(\mathbf X^T \mathbf a\) for a fixed vector \(\mathbf a\in\mathbb R^n\). It can be shown that \[\mathbf X^T \mathbf a \sim N((\mathbf 1_n \boldsymbol \mu^T)^T \mathbf a,\ (\mathbf a^T \mathbf a)\, \boldsymbol \Sigma)= N((\mathbf a^T\mathbf 1_n)\,\boldsymbol \mu,\ (\mathbf a^T \mathbf a)\, \boldsymbol \Sigma)\]
Thus, taking \(\mathbf a=\mathbb C\gamma_i=\gamma_i\), for which \(\mathbf a^T\mathbf 1_n=0\) and \(\mathbf a^T\mathbf a=1\), we get \(Y_i \sim N(\mathbf 0, \boldsymbol \Sigma)\).
Consider a random sample from MVN \(N(\boldsymbol \mu, \boldsymbol \Sigma)\). Let \(\mathbf S\) denote the sample covariance matrix.
We have already shown that \((n-1)\mathbf S \sim Wishart_p(n-1, \boldsymbol \Sigma)\).
What is the distribution of a diagonal element of \((n-1)\mathbf S\)?
What is the distribution of the sum of the elements of \((n-1)\mathbf S\)? Note that this is a special case of the next question with \(\mathbf B=(1, \cdots, 1)\).
What is the distribution of \((n-1)\mathbf B \mathbf S \mathbf B^T\), where \(\mathbf B\) is a fixed \(q\times p\) matrix?
If time permits, we will run some simulations.
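One such simulation might look like the following Python sketch (numpy, \(n\), \(p\), \(\boldsymbol\Sigma\), and the seed are arbitrary assumptions). It checks the first two questions against the answers implied by the \(\mathbf B\mathbf S\mathbf B^T\) result: a diagonal entry of \((n-1)\mathbf S\) behaves like \(\sigma_{ii}\,\chi_{n-1}^2\), and the sum of all entries behaves like \((\mathbf 1^T\boldsymbol\Sigma\mathbf 1)\,\chi_{n-1}^2\):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, reps = 12, 3, 100_000
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# reps samples of size n; center each sample to form (n-1)S = (CX)^T (CX)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=(reps, n))
Xc = X - X.mean(axis=1, keepdims=True)
W = np.einsum('rni,rnj->rij', Xc, Xc)      # (n-1)S for each replicate

# Diagonal entry: W[0,0] ~ Sigma[0,0] * chi2_{n-1}
d = W[:, 0, 0]
mean_diag = d.mean()   # theory: (n-1) * Sigma[0,0] = 22
var_diag = d.var()     # theory: 2(n-1) * Sigma[0,0]^2 = 88

# Sum of all entries: 1^T W 1 ~ (1^T Sigma 1) * chi2_{n-1}
s = W.sum(axis=(1, 2))
mean_sum = s.mean()    # theory: (n-1) * Sigma.sum() = 11 * 6.1 = 67.1
```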
If you cannot get the answer to the last question, let’s use the definition of Wishart distribution.
Let \(\mathbf W = (n-1)\mathbf S\). Because it follows \(Wishart_p(n-1, \boldsymbol\Sigma)\), we know that \(\mathbf W=\sum_{j=1}^{n-1} \mathbf Z_j \mathbf Z_j^T\), where the \(\mathbf Z_j\)’s are iid from \(N(\mathbf 0, \boldsymbol\Sigma)\).
Then \[ \begin{aligned} (n-1)\mathbf B \mathbf S \mathbf B^T &= \mathbf B\left(\sum_{j=1}^{n-1} \mathbf Z_j \mathbf Z_j^T\right)\mathbf B^T = \sum_{j=1}^{n-1} \mathbf B \mathbf Z_j \mathbf Z_j^T\mathbf B^T\\ &= \sum_{j=1}^{n-1} (\mathbf B \mathbf Z_j)(\mathbf B \mathbf Z_j)^T \end{aligned} \]
Let \(\mathbf Y_j=\mathbf B \mathbf Z_j\). Note that it is a linear function of \(\mathbf Z_j\); therefore \[\mathbf Y_j\sim N(\mathbf 0, \mathbf B \boldsymbol \Sigma \mathbf B^T)\] and the \(\mathbf Y_j\)’s are iid (because …).
By the definition of Wishart distribution, we have
\[(n-1)\mathbf B \mathbf S \mathbf B^T\sim Wishart_q(n-1, \mathbf B \boldsymbol \Sigma \mathbf B^T)\]
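A final simulation sketch (Python; numpy, \(\mathbf B\), \(n\), \(p\), \(\boldsymbol\Sigma\), and the seed are arbitrary assumptions) checks that the mean of \((n-1)\mathbf B\mathbf S\mathbf B^T\) matches \((n-1)\mathbf B\boldsymbol\Sigma\mathbf B^T\), as the Wishart result requires:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, reps = 10, 3, 50_000
Sigma = np.array([[1.0, 0.4, 0.2],
                  [0.4, 2.0, 0.0],
                  [0.2, 0.0, 1.5]])
B = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, 1.0]])   # fixed q x p matrix with q = 2

# reps samples of size n; (n-1)S from centered data, then B (n-1)S B^T
X = rng.multivariate_normal(np.zeros(p), Sigma, size=(reps, n))
Xc = X - X.mean(axis=1, keepdims=True)
W = np.einsum('rni,rnj->rij', Xc, Xc)           # (n-1)S per replicate
BWB = np.einsum('qi,rij,sj->rqs', B, W, B)      # B (n-1)S B^T per replicate

mean_hat = BWB.mean(axis=0)
theory = (n - 1) * B @ Sigma @ B.T              # mean of Wishart_q(n-1, B Sigma B^T)
```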