Institute of Mathematical Statistics and Actuarial Science


Colloquium Talk

Friday, November 17, 2017

Lecture room B78
Institute for Exact Sciences
Sidlerstr. 5
CH-3012 Bern

16:15-17:00 h

Optimal High-Dimensional Shrinkage Covariance Matrix Estimation for Elliptical Distributions and Applications in Regularized Linear Discriminant Analysis

Esa Ollila (Aalto University, Helsinki)

We derive an optimal shrinkage sample covariance matrix (SCM) estimator that is suitable for high-dimensional problems and for sampling from an unspecified elliptically symmetric distribution. Specifically, we derive the optimal (oracle) shrinkage parameters that attain the minimum mean-squared error (MMSE) between the shrinkage SCM and the true covariance matrix when sampling from an elliptical distribution. Subsequently, we show how the oracle shrinkage parameters can be consistently estimated under the random matrix theory regime. Simulations show the advantage of the proposed estimator over the conventional shrinkage SCM estimator due to Ledoit and Wolf (2004). The proposed shrinkage SCM estimator often performs significantly better than the Ledoit-Wolf estimator and has the advantage that consistency is guaranteed over the whole class of elliptical distributions with finite fourth-order moments.
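As a rough illustration of what a shrinkage SCM looks like, the sketch below forms a weighted combination of the SCM and a scaled identity matrix. The function name `shrinkage_scm`, the convex-combination form, and the hand-picked weight `beta` are assumptions made for illustration only; the oracle parameters and their consistent estimators discussed in the talk are not reproduced here.

```python
import numpy as np

def shrinkage_scm(X, beta):
    """Return (beta * S + alpha * I, S) for an n x p data matrix X.

    S is the sample covariance matrix. alpha is set to
    (1 - beta) * trace(S) / p so that the trace of S is preserved.
    This parameter choice is an illustrative assumption, not the
    oracle or estimated parameters from the talk.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)            # center the columns
    S = Xc.T @ Xc / (n - 1)            # sample covariance matrix (p x p)
    eta = np.trace(S) / p              # average eigenvalue of S
    alpha = (1.0 - beta) * eta
    return beta * S + alpha * np.eye(p), S

# High-dimensional toy setting: more variables than observations.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
Sigma_hat, S = shrinkage_scm(X, beta=0.5)  # beta is a placeholder value
```

With fewer observations than variables the SCM is singular, whereas the shrunk matrix is positive definite for any `beta` strictly between 0 and 1, which is what makes it usable wherever an inverse covariance matrix is needed.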

The proposed approach is then used in classification problems. We propose a modification of linear discriminant analysis, referred to as compressive regularized discriminant analysis (CRDA), for the analysis of high-dimensional datasets. CRDA is specifically designed for feature elimination and can be used as a gene selection method in microarray studies. Besides regularization of the SCM, CRDA also borrows ideas from $\ell_{q,1}$ matrix norm minimization algorithms used in multivariate extensions of compressed sensing. A simulation study and four real-life microarray datasets are used to evaluate the performance of CRDA-based classifiers. Overall, the proposed CRDA method gives fewer misclassification errors than its competitors while at the same time achieving accurate feature elimination.
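One way to picture row-wise feature elimination in the spirit of $\ell_{q,1}$-type methods is sketched below: given a coefficient matrix with one row per feature and one column per class, keep only the rows with the largest $\ell_q$ norms and zero out the rest. The helper `keep_top_rows`, the toy matrix `B`, and the choice of `K` are hypothetical and serve only to illustrate the idea; this is not claimed to be the CRDA procedure itself.

```python
import numpy as np

def keep_top_rows(B, K, q=2):
    """Zero out all but the K rows of B with the largest l_q norm.

    Rows correspond to features, columns to classes. Returns the
    sparsified matrix and the sorted indices of the retained rows.
    Illustrative helper only; not the authors' implementation.
    """
    norms = np.linalg.norm(B, ord=q, axis=1)  # l_q norm of each row
    keep = np.argsort(norms)[-K:]             # indices of the K largest norms
    out = np.zeros_like(B)
    out[keep] = B[keep]
    return out, np.sort(keep)

# Toy coefficient matrix: 4 features, 2 classes.
B = np.array([[3.0, 4.0],
              [0.1, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
B_sparse, kept = keep_top_rows(B, K=2)
```

A feature whose row is zeroed contributes nothing to any class score, so it is eliminated for all classes at once rather than one class at a time.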