Professor, Department of Statistics
2026-04-27
The Wishart distribution can be defined in the following way.
Let \(\mathbf W\) be a \(p\times p\) random matrix.
We say \(\mathbf W\) follows \(Wishart_{p}(k, \boldsymbol \Sigma)\) if \(\mathbf W\) can be written as \(\mathbf W=\mathbf X^T \mathbf X\), where \(\mathbf X\) denotes the random matrix formed by a random sample of size \(k\) from the multivariate normal distribution \(N(\mathbf 0, \boldsymbol \Sigma)\).
Remark: \(E[\mathbf W]=k\boldsymbol\Sigma\).
That is, if \[\mathbf X_1, \cdots, \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma),\]
then \[\mathbf X^T \mathbf X=\sum_{i=1}^k \mathbf X_i \mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma).\]
Wishart: If \(\mathbf X_1, \cdots, \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma)\), then \[\mathbf X^T \mathbf X =\sum_{i=1}^k \mathbf X_i\mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma) \mbox{, where } \mathbf X_{k\times p}=\begin{pmatrix} \mathbf X_1^T\\ \vdots\\ \mathbf X_k^T \end{pmatrix} \]
Chi-squared: If \(X_1, \cdots, X_k \overset{iid}\sim N(0,1)\), then
\[\mathbf X^T\mathbf X=\sum_{i=1}^k X_i^2\sim \chi_k^2 \mbox{, where } \mathbf X_{k\times 1}=
\begin{pmatrix}
X_1 \\ \vdots \\ X_k
\end{pmatrix}\]
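As a minimal sketch of this construction, one Wishart draw can be generated in R directly from the definition (the values of \(k\) and \(\boldsymbol\Sigma\) below are illustrative choices, not from the lecture data):

# Construct one Wishart_p(k, Sigma) draw from its definition
library(MASS)                                  # for mvrnorm
set.seed(1)
k = 5; Sigma = matrix(c(2, -1.4, -1.4, 2), 2, 2)
X = mvrnorm(k, mu = c(0, 0), Sigma = Sigma)    # k x p matrix; rows are iid N(0, Sigma)
W = t(X) %*% X                                 # one draw from Wishart_2(5, Sigma)
W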
Let \(\mathbf X_1, \cdots, \mathbf X_n\) be a random sample from \(N(\boldsymbol \mu, \boldsymbol \Sigma)\).
We have shown that
\[(n-1)\mathbf S \sim Wishart_p(n-1, \boldsymbol\Sigma)\]
Recall that the centering matrix \(\mathbb C\) is symmetric and idempotent with rank \(n-1\), so it has the spectral decomposition \(\mathbb C=\sum_{i=1}^{n-1}\boldsymbol\gamma_i \boldsymbol\gamma_i^T\) with orthonormal \(\boldsymbol\gamma_i\). Then \[ \begin{aligned} (n-1)\mathbf S&=\mathbf X^T \mathbb C \mathbf X=\mathbf X^T \mathbb C^T\mathbb C\mathbb C \mathbf X=(\mathbb C \mathbf X)^T\mathbb C(\mathbb C \mathbf X)\\ &=(\mathbb C \mathbf X)^T\left(\sum_{i=1}^{n-1}\boldsymbol\gamma_i \boldsymbol\gamma_i^T\right) (\mathbb C \mathbf X)\\ &=\sum_{i=1}^{n-1} (\boldsymbol\gamma_i^T \mathbb C \mathbf X)^T (\boldsymbol\gamma_i^T \mathbb C \mathbf X) \end{aligned}\]
Let \(\mathbf Y_i= (\boldsymbol\gamma_i^T \mathbb C \mathbf X)^T\).
We have shown that \(\mathbf Y_1, \cdots, \mathbf Y_{n-1} \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma)\).
Following the definition of Wishart, we have
\[(n-1)\mathbf S \sim Wishart_p(n-1, \boldsymbol \Sigma).\]
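The remark \(E[\mathbf W]=k\boldsymbol\Sigma\) can be checked by simulation. The sketch below assumes \(k=2\) and \(\boldsymbol\Sigma=\begin{pmatrix}2 & -1.4\\ -1.4 & 2\end{pmatrix}\), values chosen to be consistent with the output that follows:

# Empirical mean of many Wishart draws vs. the theoretical mean k * Sigma
set.seed(1)
k = 2; Sigma = matrix(c(2, -1.4, -1.4, 2), 2, 2)
W.draws = rWishart(10000, df = k, Sigma = Sigma)  # p x p x 10000 array
apply(W.draws, c(1, 2), mean)                     # should be close to k * Sigma
k * Sigma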
[,1] [,2]
[1,] 4.026980 -2.843953
[2,] -2.843953 4.048838
[,1] [,2]
[1,] 4.0 -2.8
[2,] -2.8 4.0
Let \(\boldsymbol{X}_1, \boldsymbol{X}_2, ..., \boldsymbol{X}_n\) be a random sample from a multivariate normal distribution with mean vector \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\).
The sample mean vector and sample covariance matrix are denoted by \(\bar{\mathbf X}\) and \(\mathbf S\), respectively.
The null hypothesis of interest is \(H_0: \boldsymbol \mu = \boldsymbol \mu_0\).
The one-sample Hotelling's \(T^2\) statistic is defined as \[T^2=(\hat{\boldsymbol \mu} - \boldsymbol \mu_0)^T \left(Cov(\hat{\boldsymbol \mu})\right)^{-1}(\hat{\boldsymbol \mu} - \boldsymbol \mu_0),\] where \(\hat{\boldsymbol\mu}=\bar{\mathbf X}\).
We have shown that \(T^2\sim T_{p, n-1}^2\) under \(H_0: \boldsymbol \mu=\boldsymbol \mu_0\).
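A minimal sketch of the one-sample test in R (the data matrix X below is simulated placeholder data, not the lecture's dataset):

# One-sample Hotelling's T2 test of H0: mu = mu0
one.sample.T2 = function(X, mu0){
  n = dim(X)[1]; p = dim(X)[2]
  X.bar = colMeans(X); S = cov(X)
  T2 = n * t(X.bar - mu0) %*% solve(S) %*% (X.bar - mu0)
  # T2 * (n - p) / ((n - 1) * p) follows F_{p, n-p} under H0
  p.value = 1 - pf(T2 * (n - p) / ((n - 1) * p), p, n - p)
  return(list(T2 = T2, p.value = p.value))
}
set.seed(1)
X = MASS::mvrnorm(50, mu = c(0, 0), Sigma = diag(2))
one.sample.T2(X, mu0 = c(0, 0))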
Example R output: eigen() decomposition of a \(4\times 4\) matrix.
eigen() decomposition
$values
[1] 8.8 4.0 4.0 4.0
$vectors
[,1] [,2] [,3] [,4]
[1,] -0.5 0.8660254 0.0000000 0.0000000
[2,] -0.5 -0.2886751 -0.5773503 -0.5773503
[3,] -0.5 -0.2886751 -0.2113249 0.7886751
[4,] -0.5 -0.2886751 0.7886751 -0.2113249
Sample mean vector \(\bar{\mathbf X}\) of the four protein sources:
meat dairy veg other
24.034032 15.928361 7.660490 7.738634
Estimated covariance matrix of the sample mean, \(Cov(\bar{\mathbf X})=\mathbf S/n\):
meat dairy veg other
meat 0.07159404 0.013584596 0.018824131 0.009220700
dairy 0.01358460 0.073421655 0.005829816 0.003895500
veg 0.01882413 0.005829816 0.086176323 0.009828535
other 0.00922070 0.003895500 0.009828535 0.075478822
The result indicates that \[Pr\left[(\bar{\mathbf X} - \boldsymbol \mu)^T \left(Cov(\bar{\mathbf X})\right)^{-1}(\bar{\mathbf X} - \boldsymbol \mu)\le \frac{(n-1)p}{n-p} F_{p, n-p, 1-\alpha}\right]=1-\alpha,\] where \(Cov(\bar{\mathbf X})=\frac{\mathbf S}{n}\).
Thus, a \((1-\alpha)100\%\) confidence region for \(\boldsymbol \mu\) is \[\left\{\boldsymbol\mu: (\bar{\mathbf X} - \boldsymbol \mu)^T \left(Cov(\bar{\mathbf X})\right)^{-1}(\bar{\mathbf X} - \boldsymbol \mu)\le \frac{(n-1)p}{n-p} F_{p, n-p, 1-\alpha}\right\}\]
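As a sketch, membership of a candidate \(\boldsymbol\mu_0\) in this region can be checked numerically (X below is simulated placeholder data):

# Is mu0 inside the 95% confidence region for mu?
set.seed(1)
n = 50; p = 2
X = MASS::mvrnorm(n, mu = c(1, 2), Sigma = diag(2))
X.bar = colMeans(X); S = cov(X)
mu0 = c(1, 2)
lhs = n * t(X.bar - mu0) %*% solve(S) %*% (X.bar - mu0)  # (Cov(X.bar))^{-1} = n * S^{-1}
rhs = (n - 1) * p / (n - p) * qf(0.95, p, n - p)
lhs <= rhs   # TRUE if mu0 falls inside the region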
If there is only one parameter of interest, we can construct a C.I. using the t-distribution, just as in univariate analysis.
Example. What is the mean protein intake from source \(j\)?
This lecture: we construct a C.I. for \(\mu_j\) by using \(t_{n-1, 1-\frac{\alpha}{2}}\) as the critical value:
\[\bar{X}_{(j)} \pm t_{n-1, 1- \frac{\alpha}{2}}\sqrt{\frac{s^2_{X_{(j)}}}{n}} \]
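For example, for the mean meat intake, the interval can be computed from the sample mean and standard error printed in this lecture's output; n = 60 is an assumption consistent with the unadjusted critical value shown later:

# Unadjusted 95% t-interval for the mean meat intake
n = 60                                        # assumed sample size
x.bar.meat = 24.034032; se.meat = 0.2675706   # values from the printed output
x.bar.meat + c(-1, 1) * qt(1 - 0.05/2, n - 1) * se.meat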
What if we are interested in several simultaneously? We will discuss simultaneous confidence intervals in the next few slides.
Let \(A_j\) denote the event that the C.I. for \(\mu_j\) covers \(\mu_j\). Even if each individual interval has coverage \(1-\alpha\), clearly \(Pr(A_1\cap A_2 \cap A_3 \cap A_4)<1-\alpha\).
Thus, if we use \(t_{n-1, 1-\frac{\alpha}{2}}\) as the critical value, we do not have enough coverage for all the parameters in \(\boldsymbol \mu\) simultaneously.
What we need to construct are simultaneous confidence intervals.
Method 1 for simultaneous C.I.: \(T^2\). An application of the extended Cauchy–Schwarz inequality (maximizing the squared standardized deviation over all directions \(a\)) ensures that the following method gives \((1-\alpha)100\%\) confidence to cover all linear combinations of the parameters (in the form of \(a^T\boldsymbol \mu\)) simultaneously: \[a^T\bar{\mathbf X}\pm \sqrt{\frac{(n-1)p}{n-p}F_{p, n-p, 1-\alpha}}\, se(a^T\bar{\mathbf X}), \mbox{ where } se(a^T\bar{\mathbf X})=\sqrt{\frac{a^T \mathbf S a}{n}}.\]
Method 2: Bonferroni's correction. Simply replace \(\alpha\) with \(\alpha/k\), where \(k\) is the number of linear functions of the mean parameters of interest; the critical value becomes \(t_{n-1, 1-\alpha/(2k)}\).
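A sketch of the three critical values, assuming \(n=60\) and \(p=k=4\) (assumptions chosen to be consistent with the values printed below):

alpha = 0.05; n = 60; p = 4           # assumed sample size and dimension
# unadjusted t critical value: correct for one interval only
qt(1 - alpha/2, n - 1)
# T2-based critical value: covers all linear combinations simultaneously
sqrt((n - 1) * p / (n - p) * qf(1 - alpha, p, n - p))
# Bonferroni critical value with k = p = 4 intervals
qt(1 - alpha / (2 * p), n - 1)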
[1] "sample means"
meat dairy veg other
24.034032 15.928361 7.660490 7.738634
[1] "standard errors"
meat dairy veg other
0.2675706 0.2709643 0.2935580 0.2747341
[1] "critical values based on T2"
[1] "calculate critical value based on Bonferroni"
[1] "lower bounds"
[1] "upper bounds"
[1] "unadjusted critical value"
[1] 2.000995
[1] "critical value based on T2"
[1] 3.269537
[1] "critical value based on Bonferroni"
[1] 2.576588
\[s^2_p = \dfrac{(n_1-1)s^2_1+(n_2-1)s^2_2}{n_1+n_2-2}\] where \[s^2_i = \dfrac{\sum_{j=1}^{n_i}X^2_{ij}-(\sum_{j=1}^{n_i}X_{ij})^2/n_i}{n_i-1}\]
Two-sample t-statistic \[t = \dfrac{\bar{x}_1-\bar{x}_2}{\sqrt{s^2_p(\dfrac{1}{n_1}+\dfrac{1}{n_2})}} \]
Null distribution: \(t\overset{H_0}\sim t_{n_1+n_2-2}\).
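In R, this pooled-variance test is available through t.test with var.equal = TRUE (x and y below are simulated placeholder samples):

# Univariate pooled two-sample t-test
set.seed(1)
x = rnorm(20, mean = 0); y = rnorm(25, mean = 1)
t.test(x, y, var.equal = TRUE)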
Two independent samples
\[\mathbf X_{11}, \cdots,\mathbf X_{1,n_1}\overset{iid} \sim N(\boldsymbol \mu_1, \boldsymbol \Sigma)\]
\[\mathbf X_{21}, \cdots,\mathbf X_{2,n_2}\overset{iid} \sim N(\boldsymbol \mu_2, \boldsymbol \Sigma)\]
Null and alternative hypotheses: \(H_0: \boldsymbol \mu_1=\boldsymbol \mu_2\) vs \(H_1: \boldsymbol \mu_1\not=\boldsymbol \mu_2\)
Pooled sample covariance matrix \[\mathbf{S}_p = \dfrac{(n_1-1)\mathbf{S}_1+(n_2-1)\mathbf{S}_2}{n_1+n_2-2}\] where \[\mathbf{S}_i = \dfrac{1}{n_i-1}\sum_{j=1}^{n_i}{(\mathbf X_{ij}-\bar{\mathbf X}_i)(\mathbf X_{ij}-\bar{\mathbf X}_i)^T}\]
Two-sample Hotelling’s \(T^2\) \[T^2 = {(\bar{\mathbf X}_1 - \bar{\mathbf X}_2)}^T\{\mathbf{S}_p(\frac{1}{n_1}+\frac{1}{n_2})\}^{-1} {(\bar{\mathbf X}_1 - \bar{\mathbf X}_2)}\]
Null distribution: \[T^2 \overset{H_0}\sim \frac{(n_1+n_2-2)p}{n_1+n_2-p-1} F_{p, n_1+n_2-p-1}\]
Hotelling.T2.2sample = function(X, Y){
  # X: n x p data matrix (sample 1); Y: m x p data matrix (sample 2)
  n = dim(X)[1]; m = dim(Y)[1]; p = dim(X)[2]
  if(p != dim(Y)[2]) return("Error: the dimensions of X and Y are not the same")
  X.bar = colMeans(X); Y.bar = colMeans(Y)
  X.S = cov(X); Y.S = cov(Y)
  # pooled sample covariance matrix
  pooled.S = ((n-1)*X.S + (m-1)*Y.S)/(m+n-2)
  # two-sample Hotelling's T2 statistic
  T2 = t(X.bar-Y.bar) %*% solve((1/n+1/m)*pooled.S) %*% (X.bar-Y.bar)
  # convert T2 to its F scale and compute the p-value
  p.value = 1 - pf(T2/((n+m-2)*p/(n+m-1-p)), p, n+m-1-p)
  return(list(X.bar=X.bar, Y.bar=Y.bar, T2=T2, p.value=p.value))
}

We can use Hotelling.T2.2sample to compare the mean vectors of iris setosa and versicolor: we might be interested in the difference between the two species in the four features.
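As a sketch, the call looks like this (iris ships with R; columns 1 to 4 hold the four features):

X = as.matrix(iris[iris$Species == "setosa", 1:4])
Y = as.matrix(iris[iris$Species == "versicolor", 1:4])
Hotelling.T2.2sample(X, Y)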
Because we are interested in all four features, we need to construct simultaneous C.I.s for them. Two methods give critical values adjusted for multiple C.I.s: the \(T^2\)-based critical value
\[\sqrt{\frac{(n_1+n_2-2)p}{n_1+n_2-p-1}F_{p, n_1+n_2-p-1, 1-\alpha}}\]
and the Bonferroni critical value \(t_{n_1+n_2-2,\, 1-\alpha/(2k)}\).
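A sketch of both critical values for the iris comparison (\(n_1=n_2=50\) observations per species, \(p=k=4\) features):

n1 = 50; n2 = 50; p = 4; alpha = 0.05
# T2-based critical value
sqrt((n1 + n2 - 2) * p / (n1 + n2 - p - 1) * qf(1 - alpha, p, n1 + n2 - p - 1))
# Bonferroni critical value with k = 4 intervals
qt(1 - alpha / (2 * p), n1 + n2 - 2)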
T²-based simultaneous C.I.s:
diff se CI.lower CI.upper
Sepal.Length -0.930 0.088 -1.212 -0.648
Sepal.Width 0.658 0.070 0.436 0.880
Petal.Length -2.798 0.071 -3.024 -2.572
Petal.Width -1.080 0.032 -1.181 -0.979
Bonferroni simultaneous C.I.s:
diff se CI.lower CI.upper
Sepal.Length -0.930 0.088 -1.155 -0.705
Sepal.Width 0.658 0.070 0.481 0.835
Petal.Length -2.798 0.071 -2.978 -2.618
Petal.Width -1.080 0.032 -1.161 -0.999
Large-sample inference without the normality assumption: by the multivariate central limit theorem, together with \(\mathbf S \overset{P}\rightarrow \boldsymbol\Sigma\) and Slutsky's theorem,
\[\begin{aligned} & & \sqrt{n} (\bar{\mathbf X} -\boldsymbol \mu ) \overset{\mathbf D} \rightarrow N(\mathbf 0, \boldsymbol \Sigma)\\ & \Rightarrow & n(\bar{\mathbf X}-\boldsymbol \mu)^T \mathbf S^{-1}(\bar{\mathbf X}-\boldsymbol \mu) \overset{\mathbf D}\rightarrow \chi_p^2 \end{aligned} \]
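A sketch of the resulting large-sample test (the data below are simulated and deliberately non-normal):

# Large-sample chi-square test of H0: mu = mu0; no normality required
set.seed(1)
n = 200; p = 3
X = matrix(rexp(n * p) - 1, n, p)   # mean-zero, skewed data
X.bar = colMeans(X); S = cov(X)
mu0 = rep(0, p)
stat = n * t(X.bar - mu0) %*% solve(S) %*% (X.bar - mu0)
1 - pchisq(stat, df = p)            # approximate p-value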