Dimension reduction for outlier detection in high-dimensional data
Pub Date: 2025-11-11 | DOI: 10.1016/j.jmva.2025.105531
Santiago Ortiz, Henry Laniado, Daniel Peña, Francisco J. Prieto
This work introduces the KASP (Kurtosis and Skewness Projections) procedure, a method for detecting outliers in high-dimensional multivariate data based on dimension reduction techniques. The procedure involves finding projections that maximize non-normality measures of the distribution of the observations. These projections are based on three directions: one that maximizes a combination of the squared skewness and kurtosis coefficients, one that minimizes the kurtosis coefficient, and one that maximizes the squared skewness coefficient. The study demonstrates that, for many different contamination structures, these directions include the optimal one for identifying outliers. The performance of the KASP procedure is compared with that of alternative methods in terms of correct identification and false detection of outliers in high-dimensional data sets. Additionally, the paper presents three practical examples to illustrate the effectiveness of the procedure for outlier detection in high dimensions.
Journal of Multivariate Analysis, Volume 211, Article 105531.
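The abstract does not spell out the optimization, but one of its ingredients — a unit direction maximizing the squared skewness of the projected observations — can be sketched with generic numerical tools. The Python sketch below illustrates that single ingredient with a naive multi-start search; it is not the authors' KASP algorithm, and the median/MAD cutoff used to flag points on the projection is a common convention rather than the paper's rule.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import skew

def max_skewness_direction(X, n_starts=10, seed=0):
    """Search for a unit direction maximizing the squared skewness of X @ u.
    X is an (n, p) array; several random restarts are used because the
    objective is non-convex."""
    rng = np.random.default_rng(seed)
    neg_obj = lambda u: -skew(X @ (u / np.linalg.norm(u))) ** 2
    best_u, best_val = None, np.inf
    for _ in range(n_starts):
        res = minimize(neg_obj, rng.standard_normal(X.shape[1]), method="Nelder-Mead")
        if res.fun < best_val:
            best_val, best_u = res.fun, res.x / np.linalg.norm(res.x)
    return best_u

def flag_outliers(X, u, cut=3.5):
    """Flag points whose projection on u has a large robust z-score (median/MAD)."""
    z = X @ u
    mad = np.median(np.abs(z - np.median(z)))
    return np.abs(z - np.median(z)) / (1.4826 * mad) > cut
```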
Projection pursuit Bayesian regression for symmetric matrix predictors
Pub Date: 2025-11-11 | DOI: 10.1016/j.jmva.2025.105539
Xiaomeng Ju, Hyung G. Park, Thaddeus Tarpey
This paper develops a novel Bayesian approach for nonlinear regression with symmetric matrix predictors, which are often used to encode connectivity between nodes. Unlike methods that vectorize the matrices as predictors, which results in a large number of model parameters and unstable estimation, we propose a Bayesian multi-index regression method, yielding a projection-pursuit-type estimator that leverages the structure of matrix-valued predictors. We establish the model identifiability conditions and impose a sparsity-inducing prior on the projection directions so that sparse directions are sampled, preventing overfitting and enhancing the interpretability of the parameter estimates. Posterior inference is conducted through Bayesian backfitting. The performance of the proposed method is evaluated through simulation studies and a case study investigating the relationship between brain connectivity features and cognitive scores.
Journal of Multivariate Analysis, Volume 211, Article 105539.
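The abstract does not give the model equation. For symmetric matrix predictors, one common way to build a multi-index structure is through quadratic-form projections wᵀ X_i w of each matrix onto sparse directions; the snippet below generates data from a single-index instance of that structure purely to illustrate what such a projection-pursuit-type model can look like. The link function, sparsity pattern and dimensions are invented for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 10

# Symmetric matrix predictors (e.g., connectivity matrices).
A = rng.standard_normal((n, p, p))
X = (A + A.transpose(0, 2, 1)) / 2

# One illustrative index: a quadratic-form projection w' X_i w with a sparse unit
# direction w, passed through a nonlinear link.  This is only a plausible single-index
# instance of the multi-index structure described in the abstract, not the authors' model.
w = np.zeros(p)
w[:3] = [0.8, -0.5, 0.3]
w /= np.linalg.norm(w)

index = np.einsum("i,nij,j->n", w, X, w)          # w' X_i w for each observation
y = np.sin(index) + 0.1 * rng.standard_normal(n)  # hypothetical link plus noise
```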
Star products and dimension reduction
Pub Date: 2025-11-11 | DOI: 10.1016/j.jmva.2025.105523
Nicola Loperfido
The star product of two matrices is the linear combination of the blocks of the second matrix, with the corresponding elements of the first matrix as coefficients. In probability and statistics, the star product has appeared in conjunction with measures of multivariate skewness and kurtosis, within the frameworks of model-based clustering, multivariate normality testing, outlier detection, invariant coordinate selection and independent component analysis. In this paper, we investigate some properties of the star product and their applications to dimension reduction techniques, including common principal components and invariant coordinate selection. The connections of the star product with tensor concepts and three-way data are also considered. The theoretical results are illustrated with the Iris Flowers and the Swiss Banknotes datasets.
Journal of Multivariate Analysis, Volume 211, Article 105523.
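Reading the verbal definition literally — the (i, j) element of the first matrix weights the (i, j) block of the second — a star product can be computed as below. The block-partitioning convention (all blocks of equal size) is an assumption made for the sketch; the paper may use a more general convention.

```python
import numpy as np

def star_product(A, B, block_shape):
    """Star product of A (m x n) with a block matrix B partitioned into m x n blocks
    of size block_shape = (p, q): returns sum_{i,j} A[i, j] * B_block[i, j]."""
    m, n = A.shape
    p, q = block_shape
    if B.shape != (m * p, n * q):
        raise ValueError("B must consist of m x n blocks of size (p, q)")
    out = np.zeros((p, q))
    for i in range(m):
        for j in range(n):
            out += A[i, j] * B[i * p:(i + 1) * p, j * q:(j + 1) * q]
    return out
```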
Parsimonious multivariate structural spatial models with intra-location feedback
Pub Date: 2025-11-10 | DOI: 10.1016/j.jmva.2025.105541
Hossein Asgharian, Krzysztof Podgórski, Nima Shariati
In univariate spatial stochastic models, the dimension of the parameter space is reduced through structural models with a known adjacency matrix. This structural reduction is also applied in multivariate spatial settings, where matrix-valued observations represent locations along one coordinate and multivariate variables along the other. However, such reduction often goes too far, omitting parameters that capture natural and important dependencies. Widely used models, including the spatial error and spatial lag models, lack parameters for intra-location dependencies. In a spatial econometric context, for example, while parameters link inflation and interest rates across economies, there is no explicit way to represent the effect of inflation on interest rates within a given economy. Through examples and analytical arguments, it is shown that when intra-location feedback exists in the data, standard models fail to capture it, leading to serious misrepresentation of other effects. As a remedy, this paper develops multivariate spatial models that incorporate feedback between variables at the same location. Given the high-dimensional nature of structural models, the challenge is to introduce such effects without substantially enlarging the parameter space, thereby avoiding overparameterization or non-identifiability. This is achieved by adding a single parameter that accounts for intra-location feedback. The proposed models are well defined under a general second-order framework, accommodating non-Gaussian distributions. The dimensions of the parameter space, model identification, and other fundamental properties are established. Statistical inference is discussed using both empirical precision matrix methods and maximum likelihood. While the main contribution lies in static models, extensions to time-dependent data are also formulated, showing that dynamic generalizations are straightforward.
Journal of Multivariate Analysis, Volume 211, Article 105541.
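For reference, the two standard univariate structural specifications mentioned above, for a known spatial weight (adjacency) matrix W, are the spatial lag model y = ρWy + Xβ + ε and the spatial error model y = Xβ + u with u = λWu + ε. In both, all cross-location dependence is channelled through a single scalar coefficient (ρ or λ) and the fixed matrix W, which is the kind of parameter-space reduction the paper starts from.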
Enhancing spatial functional linear regression with robust dimension reduction methods
Pub Date: 2025-11-10 | DOI: 10.1016/j.jmva.2025.105538
Ufuk Beyaztas, Abhijit Mandal, Han Lin Shang
This paper introduces a robust estimation strategy for the spatial functional linear regression model using dimension reduction methods, specifically functional principal component analysis (FPCA) and functional partial least squares (FPLS). These techniques are designed to address challenges associated with spatially correlated functional data, particularly the impact of outliers on parameter estimation. By projecting the infinite-dimensional functional predictor onto a finite-dimensional space spanned by orthonormal basis functions and employing M-estimation to mitigate outlier effects, our approach improves the accuracy and reliability of parameter estimates in the spatial functional linear regression context. Simulation studies and empirical data analysis substantiate the effectiveness of our methods. Fisher consistency and the influence function of the FPCA-based approach are established under regularity conditions. The rfsac package in R implements these robust estimation strategies, ensuring practical applicability for researchers and practitioners.
Journal of Multivariate Analysis, Volume 211, Article 105538.
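As a rough illustration of the FPCA-plus-M-estimation idea, the sketch below projects discretized curves onto their leading empirical eigenfunctions and fits the resulting scores with a Huber M-estimator via statsmodels. It deliberately ignores the spatial autocorrelation component and the FPLS variant, and it is not the rfsac implementation; the equally spaced grid and the number of components are assumptions of the example.

```python
import numpy as np
import statsmodels.api as sm

def fpca_scores(X_curves, n_comp):
    """X_curves: (n, T) functional predictors discretized on a common, equally spaced grid.
    Returns the leading FPCA scores and the corresponding grid-valued eigenfunctions."""
    Xc = X_curves - X_curves.mean(axis=0)
    cov = Xc.T @ Xc / X_curves.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    phi = vecs[:, ::-1][:, :n_comp]        # leading eigenvectors as columns
    return Xc @ phi, phi

def robust_flr(X_curves, y, n_comp=3):
    """Project onto the FPCA basis, then fit the scores by Huber M-estimation."""
    scores, phi = fpca_scores(X_curves, n_comp)
    design = sm.add_constant(scores)
    fit = sm.RLM(y, design, M=sm.robust.norms.HuberT()).fit()
    beta_curve = phi @ fit.params[1:]      # coefficient function evaluated on the grid
    return fit, beta_curve
```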
Cover it up! Bipartite graphs uncover identifiability in sparse factor analysis
Pub Date: 2025-11-08 | DOI: 10.1016/j.jmva.2025.105536
Darjus Hosszejni, Sylvia Frühwirth-Schnatter
Factor models are an indispensable tool for dimension reduction in multivariate statistical analysis. Methodological research on factor models is often concerned with identifying rotations that provide the best interpretation of the loadings. This focus on rotational invariance, however, does not ensure a unique variance decomposition, which is crucial in many applications where separating common and idiosyncratic variation is key. The present paper provides conditions for variance identification based solely on a counting rule for the binary zero–nonzero pattern of the factor loading matrix, which underpins subsequent inference and interpretability. By connecting factor analysis with some classical elements from graph and network theory, it is proven that this condition is sufficient for variance identification without imposing any conditions on the factor loading matrix. An efficient algorithm is designed to verify the seemingly intractable condition in a polynomial number of steps. To illustrate the practical relevance of these new insights, the paper makes an explicit connection to post-processing in sparse Bayesian factor analysis. A simulation study and a real-world data analysis of financial returns with a time-varying factor model illustrate that verifying variance identification is highly relevant for statistical factor analysis, in particular when the factor dimension is unknown.
Journal of Multivariate Analysis, Volume 211, Article 105536.
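The abstract states the counting rule only informally, so the snippet below does not verify the paper's condition; it only illustrates the kind of bipartite-graph object involved — factors on one side, variables on the other, with edges given by the zero–nonzero loading pattern — and one polynomial-time computation on it (a maximum matching, obtained here with a generic assignment solver). The loading pattern is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical zero-nonzero pattern of a 6 x 2 loading matrix (rows: variables, columns: factors).
pattern = np.array([[1, 0],
                    [1, 0],
                    [1, 1],
                    [0, 1],
                    [1, 1],
                    [0, 1]], dtype=int)

# Bipartite view: an edge links a factor to each variable it is allowed to load on.
# A maximum matching of factors to distinct variables is one example of the
# polynomial-time graph computations that identifiability checks of this kind build on.
biadj = pattern.T                              # factors x variables biadjacency matrix
rows, cols = linear_sum_assignment(-biadj)     # assignment maximizing the number of edges used
print("every factor matched to its own variable:", bool(biadj[rows, cols].all()))
```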
Projection pursuit via kernel mean embeddings
Pub Date: 2025-11-08 | DOI: 10.1016/j.jmva.2025.105534
Oliver Warth, Lutz Dümbgen
Detecting and visualizing interesting structures in high-dimensional data is a ubiquitous challenge. If one aims for linear projections onto low-dimensional spaces, a well-known problematic phenomenon is the Diaconis–Freedman effect: under mild conditions, most projections do not reveal interesting structures but look like scale mixtures of spherically symmetric Gaussian distributions. We present a method which combines global search strategies and local projection pursuit via maximizing the maximum mean discrepancy (MMD) between the empirical distribution of the projected data and a data-driven Gaussian mixture distribution. Here, MMD is based on kernel mean embeddings with Gaussian kernels.
Journal of Multivariate Analysis, Volume 211, Article 105534.
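The MMD criterion itself is standard: with a Gaussian kernel k, the squared MMD between two samples is estimated from the average within- and between-sample kernel values. The sketch below shows only that estimator; the paper's method additionally maximizes it over projection directions against a data-driven Gaussian mixture, which is not reproduced here. The bandwidth sigma is left as a user choice.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_biased(X, Y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples X and Y."""
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()
```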
Skewness and kurtosis projection pursuit for the multivariate extended skew-normal and skew-Student distributions
Pub Date: 2025-11-08 | DOI: 10.1016/j.jmva.2025.105533
C.J. Adcock
This paper reports the results of a study of projection pursuit for the multivariate extended skew-normal and skew-Student distributions. Computation of the projection pursuit vectors is done using an algorithm that exploits the structure of the moments. Detailed results are reported for a range of values of the shape vector, the extension parameter and the degrees of freedom. The required scale matrix and shape vectors are based on data reported in a study of diabetes. The same parameters and data are used to illustrate the role that projection pursuit can play in variable selection for regression. The differences between third- and fourth-order projection pursuit are not great, a consequence of the structure of the moments induced by the form of the distribution. There are differences depending on the choice of parameterization. Use of the central parameterization changes the structure of both the covariance matrix and the shape vector.
Journal of Multivariate Analysis, Volume 211, Article 105533.
Unsupervised linear discrimination using skewness
Pub Date: 2025-11-08 | DOI: 10.1016/j.jmva.2025.105524
Una Radojičić, Klaus Nordhausen, Joni Virta
It is well known that, in Gaussian two-group separation, the optimally discriminating projection direction can be estimated without any knowledge of the group labels. In this work, we gather several such unsupervised estimators based on skewness and derive their limiting distributions. As one of our main results, we show that all affine equivariant estimators of the optimal direction have proportional asymptotic covariance matrices, making their comparison straightforward. Two of our four estimators are novel and two have been proposed earlier. We use simulations to verify our results and to inspect the finite-sample behavior of the estimators.
Journal of Multivariate Analysis, Volume 211, Article 105524.
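The headline claim — that the best discriminating direction of a two-group Gaussian mixture can be recovered without labels — is easy to probe numerically. The simulation below compares a generic skewness-maximizing direction with the label-based Fisher direction; it uses none of the paper's four estimators, and its mixture parameters are invented for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import skew

rng = np.random.default_rng(2)
n, p = 1000, 5
labels = rng.random(n) < 0.3                    # unbalanced mixture => skewed projections
mu = np.zeros(p); mu[0] = 3.0
X = rng.standard_normal((n, p)) + np.outer(labels, mu)

# Label-based benchmark: Fisher's direction Sw^{-1} (mu_1 - mu_0).
Sw = np.cov(X[labels].T) * labels.sum() + np.cov(X[~labels].T) * (~labels).sum()
fisher = np.linalg.solve(Sw, X[labels].mean(0) - X[~labels].mean(0))
fisher /= np.linalg.norm(fisher)

# Unsupervised estimate: direction maximizing |skewness| of the projections (no labels used).
neg = lambda u: -abs(skew(X @ (u / np.linalg.norm(u))))
res = min((minimize(neg, rng.standard_normal(p), method="Nelder-Mead") for _ in range(10)),
          key=lambda r: r.fun)
u = res.x / np.linalg.norm(res.x)

print("absolute cosine with Fisher direction:", abs(fisher @ u))   # typically close to 1
```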
A unified framework of principal component analysis and factor analysis
Pub Date: 2025-11-08 | DOI: 10.1016/j.jmva.2025.105529
Shifeng Xiong
Principal component analysis and factor analysis are fundamental multivariate analysis methods. In this paper, a unified framework connecting them is introduced. Under a general latent variable model, we present matrix optimization problems from the viewpoint of loss function minimization, and show that the two methods can be viewed as solutions to these optimization problems with specific loss functions. Specifically, principal component analysis can be derived from a broad class of loss functions including the ℓ₂ norm, while factor analysis corresponds to a modified ℓ₀ norm problem. Related problems are discussed, including algorithms, penalized maximum likelihood estimation under the latent variable model, and a principal component factor model. These results can lead to new data analysis tools and research topics.
Journal of Multivariate Analysis, Volume 211, Article 105529.
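The ℓ₂ side of this correspondence rests on the classical Eckart–Young fact that the leading principal components give the best low-rank reconstruction in Frobenius norm; the quick numerical check below illustrates it. The modified ℓ₀ problem associated with factor analysis in the paper is not reproduced here, and the simulated data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, r = 300, 8, 2
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p)) + 0.1 * rng.standard_normal((n, p))
Xc = X - X.mean(0)

# PCA via SVD: the rank-r reconstruction from the leading principal components ...
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_recon = U[:, :r] * s[:r] @ Vt[:r]

# ... attains the minimal Frobenius reconstruction error among all rank-r matrices
# (Eckart-Young); another rank-r choice, e.g. the next r components, does worse.
other_recon = U[:, r:2 * r] * s[r:2 * r] @ Vt[r:2 * r]
print(np.linalg.norm(Xc - pca_recon), "<=", np.linalg.norm(Xc - other_recon))
```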