Abstract
Although the ongoing digital revolution in fields such as chemometrics, genomics, or personalized medicine gives hope for considerable progress in these areas, it also produces ever more high‐dimensional data to analyze and interpret. A common task in those fields is discriminant analysis, which may however suffer from the high dimensionality of the data. Recent advances, through subspace classification or variable selection methods, have made it possible to reach either excellent classification performance or useful visualizations and interpretations. It is therefore of great interest to have both excellent classification accuracy and a meaningful variable selection for interpretation. This work addresses this issue by introducing a subspace discriminant analysis method which performs class‐specific variable selection through Bayesian sparsity. The resulting classification methodology is called sparse high‐dimensional discriminant analysis (sHDDA). Unlike most sparse methods, which are based on the Lasso, sHDDA relies on Bayesian modeling of the sparsity pattern and avoids the painstaking and sensitive cross‐validation of the sparsity level. The main features of sHDDA are illustrated on simulated and real‐world data. In particular, an exemplar application to cancer characterization based on medical imaging using radiomic feature extraction is proposed.
Original language | English |
---|---|
Journal | Journal of Chemometrics |
Volume | 33 |
Issue number | 2 |
ISSN | 0886-9383 |
DOI | |
Status | Published - 2018 |
Keywords
- Digital Revolution
- High-Dimensional Data
- Discriminant Analysis
- Variable Selection
- Bayesian Sparsity