Joakim Skogholt, Kristian H. Liland, Tormod Næs, Age K. Smilde, Ulf G. Indahl
{"title":"Selection of principal variables through a modified Gram–Schmidt process with and without supervision","authors":"Joakim Skogholt, Kristian H. Liland, Tormod Næs, Age K. Smilde, Ulf G. Indahl","doi":"10.1002/cem.3510","DOIUrl":null,"url":null,"abstract":"<p>In various situations requiring empirical model building from highly multivariate measurements, modelling based on partial least squares regression (PLSR) may often provide efficient low-dimensional model solutions. In unsupervised situations, the same may be true for principal component analysis (PCA). In both cases, however, it is also of interest to identify subsets of the measured variables useful for obtaining sparser but still comparable models without significant loss of information and performance. In the present paper, we propose a voting approach for sparse overall maximisation of variance analogous to PCA and a similar alternative for deriving sparse regression models influenced closely related to the PLSR method. Both cases yield pivoting strategies for a modified Gram–Schmidt process and its corresponding (partial) QR-factorisation of the underlying data matrix to manage the variable selection process. The proposed methods include score and loading plot possibilities that are acknowledged for providing efficient interpretations of the related PCA and PLS models in chemometric applications.</p>","PeriodicalId":15274,"journal":{"name":"Journal of Chemometrics","volume":null,"pages":null},"PeriodicalIF":2.3000,"publicationDate":"2023-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3510","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chemometrics","FirstCategoryId":"92","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/cem.3510","RegionNum":4,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"SOCIAL WORK","Score":null,"Total":0}
引用次数: 1
Abstract
In various situations requiring empirical model building from highly multivariate measurements, modelling based on partial least squares regression (PLSR) may often provide efficient low-dimensional model solutions. In unsupervised situations, the same may be true for principal component analysis (PCA). In both cases, however, it is also of interest to identify subsets of the measured variables useful for obtaining sparser but still comparable models without significant loss of information and performance. In the present paper, we propose a voting approach for sparse overall maximisation of variance analogous to PCA and a similar alternative for deriving sparse regression models influenced closely related to the PLSR method. Both cases yield pivoting strategies for a modified Gram–Schmidt process and its corresponding (partial) QR-factorisation of the underlying data matrix to manage the variable selection process. The proposed methods include score and loading plot possibilities that are acknowledged for providing efficient interpretations of the related PCA and PLS models in chemometric applications.
期刊介绍:
The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.