Lecture 7: Hotelling’s T2

Zhaoxia Yu

Professor, Department of Statistics

2026-04-27

Code

knitr::opts_chunk$set(echo = TRUE)
library(tidyr) #the pipe (%>%) tool is extremely useful
library(MASS)
library(corrplot)#for visualizing the corr matrix of the iris data
library(car)

Outline

Review of Wishart distribution
Hotelling’s \(T^2\) distribution for one-sample problems
Examples of one-sample Hotelling’s \(T^2\)
Two-sample Hotelling’s \(T^2\)
Examples of two-sample Hotelling’s \(T^2\)
The multivariate normality (MVN) assumption

Review

Definition of Wishart Distribution

A Wishart distribution can be defined in the following way
Let \(\mathbf W\) be a \(p\times p\) random matrix.
We say \(\mathbf W\) follows \(Wishart_{p}(k, \boldsymbol \Sigma)\) if \(\mathbf W\) can be written as \(\mathbf W=\mathbf X^T \mathbf X\) where \(\mathbf X\) denotes the random matrix formed by a random sample of size \(k\) from MVN \(N(\mathbf 0, \boldsymbol \Sigma)\).
Remark:\(E[\mathbf W]=k\Sigma\).

The definition indicates that if

\[\mathbf X_1, \cdots \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma),\]

then \[\mathbf X^T \mathbf X=\sum_{i=1}^k \mathbf X_i \mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma).\]

Wishart vs Chi-squared

Wishart: If \(\mathbf X_1, \cdots \mathbf X_k \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma)\), then \[\mathbf X^T \mathbf X =\sum_{i=1}^k \mathbf X_i\mathbf X_i^T \sim Wishart_p(k, \boldsymbol \Sigma) \mbox{, where } \mathbf X_{k\times p}=\begin{pmatrix} X_1^T\\ \vdots\\ X_k^T \end{pmatrix} \]
Chi-squared: If \(X_1, \cdots, X_k \overset{iid}\sim N(0,1)\), then
\[\mathbf X^T\mathbf X=\sum_{i=1}^k X_i^2\sim \chi_k^2 \mbox{, where } \mathbf X_{k\times 1}= \begin{pmatrix} X_1 \\ \vdots \\ X_k \end{pmatrix}\]

Wishart vs Chi-squared (continued)

When \(p=1\), \[W=\sum_{i=1}^k X_i^2 = \sigma^2 \sum_{i=1}^k \left(\frac{X_i}{\sigma} \right)^2\sim \sigma^2 \chi_k^2 \]

Sample Covariance

Let \(\mathbf X_1, \cdots \mathbf X_n\) be a random sample from \(N(\boldsymbol \mu, \boldsymbol \Sigma)\).
We have shown that

\[(n-1)\mathbf S \sim Wishart_p(n-1, \boldsymbol\Sigma)\]

It is critical to rewrite \((n-1)\mathbf S\) in the following way.

\[ \begin{aligned} (n-1)\mathbf S&=\mathbf X^T \mathbb C^T\mathbb C\mathbb C \mathbf X=(\mathbb C \mathbf X)^T(\mathbb C \mathbf X)\\ &=(\mathbb C \mathbf X)^T\mathbb C(\mathbb C \mathbf X)\\ &=(\mathbb C \mathbf X)^T\sum_{j=1}^{n-1}\gamma_i \gamma_i^T (\mathbb C \mathbf X)\\ &=\sum_{j=1}^{n-1} (\gamma_i^T \mathbb C \mathbf X)^T (\gamma_i^T \mathbb C \mathbf X) \end{aligned}\]

Let \(Y_i= (\gamma_i^T \mathbb C \mathbf X)^T\).
We have shown that \(Y_1, \cdots, Y_{n-1} \overset{iid}\sim N(\mathbf 0, \boldsymbol \Sigma)\).
Following the definition of Wishart, we have

\[(n-1)\mathbf S \sim Wishart_p(n-1, \boldsymbol \Sigma).\]

A Simulation Study to Understand the Wishart Distribution

Recall that if \(W\sim Wishart_p(k, \boldsymbol \Sigma)\), then \(E[\mathbf W]=k\Sigma\).

Code

library(MASS)
p=2; n=5; B=1000; rho=0.7
Sigma=diag(1+rho, p, p) - matrix(rho, p, p)
wmat.array=array(0, c(B, p, p)) #wishart-distributed
for(b in 1:B){
  X=mvrnorm(n, rep(0,p), Sigma)
  wmat.array[b,,]=(n-1)*cov(X)}
apply(wmat.array, c(2,3), mean)

          [,1]      [,2]
[1,]  4.026980 -2.843953
[2,] -2.843953  4.048838

Code

Sigma*(n-1)

     [,1] [,2]
[1,]  4.0 -2.8
[2,] -2.8  4.0

Hotelling’s \(T^2\)

Definition of Hotelling’s \(T^2\)

Hotelling generalized the student’s t, which is for univarite, to Hotelling’s \(T^2\), which is the multivariate version
Definition. We say a random variable follows Hotelling’s \(T_{p,\nu}^2\) if the random variable can be written as \(\mathbf Z^T\left(\frac{W}{\nu}\right)^{-1}\mathbf Z\) where
1. \(\mathbf Z\sim N(\mathbf 0, \boldsymbol\Sigma)\)
2. \(\mathbf W \sim W_p(\nu, \boldsymbol\Sigma)\)
3. \(\mathbf Z \perp \mathbf W\)

One-Sample Hotelling \(T^2\)

Let \(\boldsymbol{X}_1, \boldsymbol{X}_2, ..., \boldsymbol{X}_n\) be a random sample from a multivariate normal distribution with mean vector \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\).
The sample mean vector and sample covariance matrix are denoted by \(\bar{\mathbf X}\) and \(\mathbf S\), respectively.
The null hypothesis of interest \(H_0: \boldsymbol \mu = \boldsymbol \mu_0\)
The one-sample Hotelling \(T^2\) is defined as \[T^2=(\hat{\mathbf \mu} - \mathbf \mu_0)^T \left(Cov(\hat{\mathbf \mu})\right)^{-1}(\hat{\mathbf \mu} - \mathbf \mu_0)\]
We have shown that \(T^2\sim T_{p, n-1}^2\) when \(H_0: \boldsymbol \mu=\boldsymbol \mu_0\).

Hotelling’s \(T^2\) Distribution vs \(F\) Distribution

Write an R function to conduct Hotelling’s \(T^2\)

There is no R base function for conducting Hotelling’s \(T^2\) test
We will write an R function

Code

#Hotelling's T^2 for testing H0: mu=mu0 vs mu != mu0
Hotelling.T2.1sample=function(X, mu0)
{
  n=dim(X)[1]
  p=dim(X)[2]
  X.bar=colMeans(X)
  X.S=cov(X)
  T2=n*t(X.bar-mu0)%*%solve(X.S)%*%(X.bar-mu0)
  p.value=1-pf(T2/((n-1)*p/(n-p)),p,n-p)
  return(list(X.bar=X.bar, X.cov=X.S, T2=T2, p.value=p.value))
}

Example: Protein Intake

For the protein intake data, it might be more interesting to estimate the means than conducting hypothesis testing
Suppose we are interested in estimating the means of the daily protein intake from different sources

Code

library(MASS)#the library "MASS" is required
my.cov=4*(diag(4) + 0.3* rep(1,4)%o%rep(1,4))
n=60;p=4
my.mean=8*c(3,2,1,1)
eigen(my.cov)#to check whether the cov matrix is p.d.

eigen() decomposition
$values
[1] 8.8 4.0 4.0 4.0

$vectors
     [,1]       [,2]       [,3]       [,4]
[1,] -0.5  0.8660254  0.0000000  0.0000000
[2,] -0.5 -0.2886751 -0.5773503 -0.5773503
[3,] -0.5 -0.2886751 -0.2113249  0.7886751
[4,] -0.5 -0.2886751  0.7886751 -0.2113249

Estimate the mean vector using the sample mean vector
Estimate covariance of the sample mean vector. Recall that \(cov(\bar{\mathbf X})=\frac{\boldsymbol \Sigma}{n}\)

Code

set.seed(1)
x=mvrnorm(n, mu=my.mean, Sigma=my.cov)
protein=as.matrix(data.frame(meat=x[,1],dairy=x[,2], 
                             veg=x[,3], other=x[,4]))
colMeans(protein)

     meat     dairy       veg     other 
24.034032 15.928361  7.660490  7.738634

Code

cov(protein)/n

            meat       dairy         veg       other
meat  0.07159404 0.013584596 0.018824131 0.009220700
dairy 0.01358460 0.073421655 0.005829816 0.003895500
veg   0.01882413 0.005829816 0.086176323 0.009828535
other 0.00922070 0.003895500 0.009828535 0.075478822

Use Hotelling’s \(T^2\) to quantify uncertainties. Recall that \[T^2=(\bar{\mathbf X} - \boldsymbol \mu)^T \left(Cov(\bar{\mathbf X})\right)^{-1}(\bar{\mathbf X} - \boldsymbol \mu)\sim \frac{(n-1)p}{n-p} F_{p, n-p}\]

where \(Cov(\bar{\mathbf X})=\frac{\mathbf S}{n}\).

The result indicates that \[Pr[(\bar{\mathbf X} - \boldsymbol \mu)^T \left(Cov(\bar{\mathbf X})\right)^{-1}(\bar{\mathbf X} - \boldsymbol \mu)\le \frac{(n-1)p}{n-p} F_{p, n-p, 1-\alpha}]=1-\alpha\]
Thus, a \((1-\alpha)100\%\) confidence region for \(\boldsymbol \mu\) is \[\{\mathbf\mu: (\mathbf{\bar X} - \boldsymbol \mu)^T \left(Cov(\mathbf{\bar X})\right)^{-1}(\mathbf{\bar X} - \boldsymbol \mu)\le \frac{(n-1)p}{n-p} F_{p, n-p, 1-\alpha}\}\]

Confidence Region vs Interval

The confidence region has exactly \((1-\alpha)100\%\) confidence; however
In many situations, we would like to construct confidence intervals, which are in the form of \[\mbox{estimate}\pm \mbox{critical value} \times \mbox{standard error}\]

CI for One Parameter

If there is only one parameter of interest, we can construct a C.I. using t-distribution, just as in univariate analysis
Example. What is the mean protein intake from source \(j\)?
- Lecture 04: we constructed a large-sample C.I. by using 1.96 as the critical value. (See the protein intake example)
This lecture: we construct a C.I. for \(\mu_j\) by using \(t_{n-1, 1-\frac{\alpha}{2}}\) as the critical value

\[\bar{X}_{(j)} \pm t_{n-1, 1- \frac{\alpha}{2}}\sqrt{\frac{s^2_{X_{(j)}}}{n}} \]
What if we are interested in several simultaneously? We will discuss simultaneous confidence intervals in the next few slides.

Simultaneous C.I.s

Coverage

Let \(A_j=\{\mu_j\mbox{ is in the constructed C.I. }\}\). The C.I. in the previous slide has \((1-\alpha)100\%\) coverage for a specific \(\mu_j\), i.e., \[Pr(A_j)=1-\alpha\]
If we are interested in all the parameters, which are \(\mu_1, \mu_2, \mu_3, \mu_4\) in the protein intake example. The coverage for the mean vector is \[Pr(A_1\cap A_2 \cap A_3 \cap A_4)\]

Clearly \(Pr(A_1\cap A_2 \cap A_3 \cap A_4)<1-\alpha\)
Thus, if we use \(t_{n-1, 1-\frac{\alpha}{2}}\) as the critical value, we do not have enough coverage for all the parameters in \(\boldsymbol \mu\) simultaneously
What we need to construct are simultaneous confidence intervals

Methods for Simultaneous Confidence Intervals

Method 1 for simultaneous C.I. \(T^2\). Some linear algebra result ensures that the following method gives \((1-\alpha)100\%\) confidence to cover all linear combinations of the parameters (in the form of \(a^T\boldsymbol \mu\)) simultaneously \[a^T\bar{\mathbf X}\pm \sqrt{\frac{(n-1)p}{n-p}F_{p, n-p, 1-\alpha}} se(a^T\bar{\mathbf X}) \]
Method 2 Bonferroni’s correction: simply replace \(\alpha\) with \(\alpha/k\) where \(k\) is the total number of linear functions of the mean parameters: \(t_{n-1, 1-\alpha/(2k)}\), where \(k\) is the number of parameters of interest.

Simultaneous C.I.s using \(T^2\): Protein Intake

Code

#sample means
print("sample means")

[1] "sample means"

Code

colMeans(protein)

     meat     dairy       veg     other 
24.034032 15.928361  7.660490  7.738634

Code

#standard errors
print("standard errors")

[1] "standard errors"

Code

sqrt(diag(cov(protein)/n))

     meat     dairy       veg     other 
0.2675706 0.2709643 0.2935580 0.2747341

Code

#critical value based on T2
print("critical values based on T2")

[1] "critical values based on T2"

Code

cv=sqrt((n-1)*p/(n-p)*qf(0.95, p, n-p))
#lower bounds
low.bound=colMeans(protein) - cv *sqrt(diag(cov(protein)/n))
#upper bounds
up.bound=colMeans(protein) + cv *sqrt(diag(cov(protein)/n))

Simultaneous C.I.s using \(T^2\): Protein Intake

Put everything nicely into a data frame

Code

#put everything into a table
data.frame(lower=low.bound, mean=colMeans(protein), 
           upper=up.bound)

          lower      mean     upper
meat  23.159200 24.034032 24.908864
dairy 15.042433 15.928361 16.814289
veg    6.700691  7.660490  8.620288
other  6.840381  7.738634  8.636887

Simultaneous C.I.s using Bonferroni: Protein Intake

Code

#critical value based on Bonferroni
print("calculate critical value based on Bonferroni")

[1] "calculate critical value based on Bonferroni"

Code

cv=qt(1-0.05/p/2, n-1)
#lower bounds
print("lower bounds")

[1] "lower bounds"

Code

low.bound=colMeans(protein) - cv *sqrt(diag(cov(protein)/n))
#upper bounds
print("upper bounds")

[1] "upper bounds"

Code

up.bound=colMeans(protein) + cv *sqrt(diag(cov(protein)/n))

Put everything together

Code

#put everything into a table
data.frame(lower=low.bound, mean=colMeans(protein), 
           upper=up.bound)

          lower      mean     upper
meat  23.344613 24.034032 24.723451
dairy 15.230198 15.928361 16.626524
veg    6.904112  7.660490  8.416868
other  7.030758  7.738634  8.446511

Comparison of Different Critical Values

Three choices of critical values
- unadjusted: \(t_{n-1, 1-\alpha/2}\). Should be used if multiple linear functions need to be estimated
- \(T^2\): \(\sqrt{\frac{(n-1)p}{n-p}F_{p, n-p, 1-\alpha}}\)
- Bonferroni’s correction: simply replace \(\alpha\) with \(\alpha/k\) where \(k\) is the total number of linear functions of the mean parameters: \(t_{n-1, 1-\alpha/(2k)}\)

Example: the critical values for the individual means from four protein sources

Code

#unadjusted, shouldn't be used when constructing simultaneous C.I.s
print("unadjusted critical value")

[1] "unadjusted critical value"

Code

qt(1-0.05/2, n-1)

[1] 2.000995

Code

#T^2
print("critical value based on T2")

[1] "critical value based on T2"

Code

sqrt((n-1)*p/(n-p)*qf(0.95, p, n-p))

[1] 3.269537

Code

print("critical value based on Bonferroni")

[1] "critical value based on Bonferroni"

Code

#Bonferroni correction
qt(1-0.05/p/2, n-1)

[1] 2.576588

Two-Sample Hotellings \(T^2\)

One-Sample vs Two-Sample

In the one-sample problem, the goal is to make inference of
- univariate: a population mean (one-sample t-test problem) or
- multivariate: a population mean vector (one-sample Hotelling \(T^2\) problem)
In the two-sample problem
- univariate: compare two population means
- multivariate: compare two population mean vectors

Univariate Two-Sample Problems

Two independent samples
- Sample 1 is from population 1:
\[X_{11}, \cdots, X_{1,n_1}\overset{iid} \sim N(\mu_1,\sigma^2)\]
- Sample 2 is from population 2:
\[X_{21}, \cdots, X_{2,n_2}\overset{iid} \sim N(\mu_2,\sigma^2)\]
Null hypothesis: \(H_0: \mu_1=\mu_2\)

Pooled sample variance

\[s^2_p = \dfrac{(n_1-1)s^2_1+(n_2-1)s^2_2}{n_1+n_2-2}\] where \[s^2_i = \dfrac{\sum_{j=1}^{n_i}X^2_{ij}-(\sum_{j=1}^{n_i}X_{ij})^2/n_i}{n_i-1}\]

Two-sample t-statistic \[t = \dfrac{\bar{x}_1-\bar{x}_2}{\sqrt{s^2_p(\dfrac{1}{n_1}+\dfrac{1}{n_2})}} \]
Null distribution: \(t\overset{H_0}\sim t_{n_1+n_2-2}\).

Multivariate Two-Sample Problems

Two independent samples
- Sample 1 is from population 1:
\[\mathbf X_{11}, \cdots,\mathbf X_{1,n_1}\overset{iid} \sim N(\boldsymbol \mu_1, \boldsymbol \Sigma)\]
- Sample 2 is from population 2:
\[\mathbf X_{21}, \cdots,\mathbf X_{2,n_2}\overset{iid} \sim N(\boldsymbol \mu_2, \boldsymbol \Sigma)\]
Null and alternative hypotheses: \(H_0: \boldsymbol \mu_1=\boldsymbol \mu_2\) vs \(H_1: \boldsymbol \mu_1\not=\boldsymbol \mu_2\)

Pooled sample covariance matrix \[\mathbf{S}_p = \dfrac{(n_1-1)\mathbf{S}_1+(n_2-1)\mathbf{S}_2}{n_1+n_2-2}\] where \[\mathbf{S}_i = \dfrac{1}{n_i-1}\sum_{j=1}^{n_i}{(\mathbf X_{ij}-\bar{\mathbf X}_i)(\mathbf X_{ij}-\bar{\mathbf X}_i)'}\]
Two-sample Hotelling’s \(T^2\) \[T^2 = {(\bar{\mathbf X}_1 - \bar{\mathbf X}_2)}^T\{\mathbf{S}_p(\frac{1}{n_1}+\frac{1}{n_2})\}^{-1} {(\bar{\mathbf X}_1 - \bar{\mathbf X}_2)}\]
Null distribution: \[T^2 \overset{H_0}\sim \frac{(n_1+n_2-2)p}{n_1+n_2-p-1} F_{p, n_1+n_2-p-1}\]

Multivariate Two-Sample Problems: Write an R Function

No existing base function in R.

Code

Hotelling.T2.2sample=function(X, Y){
  n=dim(X)[1]; m=dim(Y)[1]; p=dim(X)[2]
  if(p!= dim(Y)[2]) return("Error: the dimensions of X and Y are not the same")
  X.bar=colMeans(X); Y.bar=colMeans(Y)
  X.S=cov(X); Y.S=cov(Y)
  pooled.S=((n-1)*X.S+(m-1)*Y.S)/(m+n-2)
  T2=t(X.bar-Y.bar)%*%solve((1/n+1/m)*pooled.S)%*%(X.bar-Y.bar)
  p.value=1-pf(T2/((n+m-2)*p/(n+m-1-p)),p,n+m-1-p)
  return(list(X.bar=X.bar, Y.bar=Y.bar, T2=T2, p.value=p.value))}

Multivariate Two-Sample Problems: Write an R Function

The built-in function “t.test” serves a dual-purpose (one-sample or two-sample) function for univariate analyis
We will write a dual-purpose function Hotelling.T2

Code

Hotelling.T2=function(X, Y=NULL, mu0=NULL)
{
 if(is.null(Y) && is.null(mu0) ) 
   return("Error: mu0 is not specified")
 if(!is.null(X) && !is.null(mu0)) 
   obj=Hotelling.T2.1sample(X, mu0) 
 if(!is.null(X) && !is.null(Y)) 
   obj=Hotelling.T2.2sample(X,Y)
 return(obj)
}

Multivariate Two-Sample Problems: Iris setosa vs versicolor

Apply the function Hotelling.T2.2sample to compare the mean vectors of iris setosa and versicolor

Code

Hotelling.T2.2sample(iris[1:50,1:4], iris[51:100,1:4])

$X.bar
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.006        3.428        1.462        0.246 

$Y.bar
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.936        2.770        4.260        1.326 

$T2
         [,1]
[1,] 2580.839

$p.value
     [,1]
[1,]    0

Multivariate Two-Sample Problems: Example

Applying the dual purpose function Hotelling.T2 to compare the mean vectors of iris setosa and versicolor

Code

Hotelling.T2(iris[1:50,1:4], iris[51:100,1:4])

$X.bar
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.006        3.428        1.462        0.246 

$Y.bar
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
       5.936        2.770        4.260        1.326 

$T2
         [,1]
[1,] 2580.839

$p.value
     [,1]
[1,]    0

Two-Sample Multivariate: Simultaneous C.I.s

We might be interested in the difference between iris setosa and versicolor in the four features
Because we are interested all the four features, we do need to construct simultaneous C.I.s for the four features. Two methods to find critical values with adjustment for multiple C.I.s:
- Method 1 - \(T^2\):
\[\sqrt{\frac{(n_1+n_2-2)p}{n_1+n_2-p-1}F_{p, n_1+n_2-p-1, 1-\alpha}}\]
- Method 2- Bonferroni’s correction by replacing \(\alpha\) with \(\alpha/k\), i.e., use the following critical value \[t_{n_1+n_2-2, 1-\alpha/(2k)}\]

Code

n1=n2=50; p=4
mean1=matrix(colMeans(iris[1:50,1:p]), p, 1)
mean2=matrix(colMeans(iris[51:100,1:p]), p, 1)
mean.diff = mean1-mean2
S1=cov(iris[1:50,1:p]); S2=cov(iris[51:100,1:p]); 
Sp=( (n1-1)*S1+(n2-1)*S2 )/ (n1+n2-2)

Method 1: \(T^2\)

Code

cv=sqrt((n1+n2-2)*p/(n1+n2-p-1)*qf(1-0.05, p, n1+n2-p-1 ))
round(data.frame(diff=mean.diff, se=sqrt(diag((1/n1+1/n2)*Sp) ),
CI.lower=mean1-mean2-cv*sqrt(diag((1/n1+1/n2)*Sp) ),
CI.upper=mean1-mean2+cv*sqrt(diag((1/n1+1/n2)*Sp) ) ), 3)

               diff    se CI.lower CI.upper
Sepal.Length -0.930 0.088   -1.212   -0.648
Sepal.Width   0.658 0.070    0.436    0.880
Petal.Length -2.798 0.071   -3.024   -2.572
Petal.Width  -1.080 0.032   -1.181   -0.979

Method 2: Bonferroni

Code

cv=qt(1-0.05/p/2, n1+n2-2)
round(data.frame(diff=mean.diff, se=sqrt(diag((1/n1+1/n2)*Sp) ),
CI.lower=mean1-mean2-cv*sqrt(diag((1/n1+1/n2)*Sp) ),
CI.upper=mean1-mean2+cv*sqrt(diag((1/n1+1/n2)*Sp) ) ), 3)

               diff    se CI.lower CI.upper
Sepal.Length -0.930 0.088   -1.155   -0.705
Sepal.Width   0.658 0.070    0.481    0.835
Petal.Length -2.798 0.071   -2.978   -2.618
Petal.Width  -1.080 0.032   -1.161   -0.999

MVN Assumption

The assumption of MVN

We assume each observation \(\mathbf X_i\) follows a MVN
Assessing the assumption of multivariate normality is more difficult than assessing the assumption of normality (univariate)
This is because univariate normality does not guarantee multivariate normality. Typically, we look at the following two items:
It is difficult to examine joint normality in more than 2d. In practice, we do 1d and 2d
- Marginal normality
- Are pairs of variables show elliptical contours?
Are there outliers in the data?

Assess Marignal Normality

Useful visual tools:
- histogram
- QQ plot
- scatter plot
Less useful tools (formal tests)
- Kolmogorov-Smironov test
- Shapiro-Wilk test (correlation coefficient between data and normal scores)

Histograms

Alternative text of this histogram:
- Left: Skewed distribution
- Middle: bimodal distribution
- Right: bell-shaped distribution

QQ plots

Alternative text of this QQ plot:
- Left: skewed distribution
- Middle: bimodal distribution
- Right: bell-shaped distribution

Bivariate Scatter Plots

Alternative text
- Left: bivariate normal distribution with elliptical contours, zero correlation
- Middle: bivariate normal distribution with elliptical contours, positive correlation
- Right: non-normal distribution with non-elliptical contours. The data is from a mixture of two bivariate normal distributions.

Large-Sample Results

Multivariate CLT

\[\begin{aligned} & & \sqrt{n} (\bar{\mathbf X} -\boldsymbol \mu ) \overset{\mathbf D} \rightarrow N(\mathbf 0, \boldsymbol \Sigma)\\ & \Rightarrow & n(\bar{\mathbf X}-\boldsymbol \mu)^T \mathbf S^{-1}(\bar{\mathbf X}-\boldsymbol \mu) \rightarrow \chi_p^2 \end{aligned} \]

When \(n-p\) is large, we replace \(\frac{(n-1)p}{n-p}F_{p, n-p}\) with \(\chi_{p}^2\)
When \(n_1-p\) and \(n_2-p\) are large, we replace \(\frac{(n_1+n_2-2)p}{n_1+n_2-p-1}F_{p, n_1+n_2-p-1}\) with \(\chi_{p}^2\)

Lecture 7: Hotelling’s T2

Outline

Review

Definition of Wishart Distribution

Wishart vs Chi-squared

Wishart vs Chi-squared (continued)

Sample Covariance

A Simulation Study to Understand the Wishart Distribution

Hotelling’s \(T^2\)

Definition of Hotelling’s \(T^2\)

One-Sample Hotelling \(T^2\)

One-Sample Hotelling \(T^2\)

Hotelling’s \(T^2\) Distribution vs \(F\) Distribution

Write an R function to conduct Hotelling’s \(T^2\)

Example: Protein Intake

Confidence Region vs Interval

CI for One Parameter

Simultaneous C.I.s

Coverage

Methods for Simultaneous Confidence Intervals

Simultaneous C.I.s using \(T^2\): Protein Intake

Simultaneous C.I.s using \(T^2\): Protein Intake

Simultaneous C.I.s using Bonferroni: Protein Intake

Comparison of Different Critical Values

Two-Sample Hotellings \(T^2\)

One-Sample vs Two-Sample

Univariate Two-Sample Problems

Multivariate Two-Sample Problems

Multivariate Two-Sample Problems: Write an R Function

Multivariate Two-Sample Problems: Write an R Function

Multivariate Two-Sample Problems: Iris setosa vs versicolor

Multivariate Two-Sample Problems: Example

Two-Sample Multivariate: Simultaneous C.I.s

MVN Assumption

The assumption of MVN

Assess Marignal Normality

Histograms

QQ plots

Bivariate Scatter Plots

Large-Sample Results

Assignment 2