A Computational Perspective on Projection Pursuit in High Dimensions: Feasible or Infeasible Feature Extraction
Authors: Chunming Zhang, Jimin Ye, Xiaomei Wang
Journal: International Statistical Review, volume 91, issue 1, pages 140-161 (JCR Q1, Statistics & Probability; impact factor 1.7)
Publication date: 2022-08-19
DOI: 10.1111/insr.12517 (https://onlinelibrary.wiley.com/doi/10.1111/insr.12517)
Finding a suitable representation of multivariate data is fundamental in many scientific disciplines. Projection pursuit (PP) aims to extract interesting ‘non-Gaussian’ features from multivariate data, and tends to be computationally intensive even when applied to data of low dimension. In high-dimensional settings, a recent work (Bickel et al., 2018) on PP addresses asymptotic characterization and conjectures of the feasible projections as the dimension grows with sample size. To gain practical utility of and learn theoretical insights into PP in an integral way, data analytic tools needed to evaluate the behaviour of PP in high dimensions become increasingly desirable but are less explored in the literature. This paper focuses on developing computationally fast and effective approaches central to finite sample studies for (i) visualizing the feasibility of PP in extracting features from high-dimensional data, as compared with alternative methods like PCA and ICA, and (ii) assessing the plausibility of PP in cases where asymptotic studies are lacking or unavailable, with the goal of better understanding the practicality, limitation and challenge of PP in the analysis of large data sets.
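To make the idea of extracting a 'non-Gaussian' feature concrete, here is a minimal sketch of one-dimensional projection pursuit. It is not the paper's method: the projection index (absolute excess kurtosis), the random search over directions, and all function names and the synthetic data are assumptions chosen for illustration only.

```python
import numpy as np

def whiten(X):
    """Center X and rescale it so the sample covariance is the identity."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

def pp_random_search(X, n_dirs=2000, seed=0):
    """Among random unit directions, return the one whose projection of the
    whitened data has the largest absolute excess kurtosis (an assumed index)."""
    rng = np.random.default_rng(seed)
    Z = whiten(X)
    dirs = rng.standard_normal((n_dirs, Z.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = Z @ dirs.T                                  # shape (n, n_dirs)
    proj = (proj - proj.mean(axis=0)) / proj.std(axis=0)
    scores = np.abs((proj ** 4).mean(axis=0) - 3.0)    # near 0 for Gaussian projections
    best = int(np.argmax(scores))
    return dirs[best], float(scores[best])

# Hypothetical data: 5 variables, only the first is non-Gaussian (bimodal).
rng = np.random.default_rng(1)
n, p = 500, 5
X = rng.standard_normal((n, p))
X[:, 0] = rng.choice([-2.0, 2.0], size=n) + 0.5 * rng.standard_normal(n)

direction, score = pp_random_search(X)
print("best direction:", np.round(direction, 2))
print("absolute excess kurtosis:", round(score, 2))
```

In this low-dimensional setting the selected direction should load mainly on the first coordinate. Raising `p` towards `n` in the same script is one simple way to probe the high-dimensional regime the abstract refers to, though random search is a crude optimizer and a careful study would need the tools the paper develops.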
About the journal:
International Statistical Review is the flagship journal of the International Statistical Institute (ISI) and of its family of Associations. It publishes papers of broad and general interest in statistics and probability. The term Review is to be interpreted broadly. The types of papers that are suitable for publication include (but are not limited to) the following: reviews/surveys of significant developments in theory, methodology, statistical computing and graphics, statistical education, and application areas; tutorials on important topics; expository papers on emerging areas of research or application; papers describing new developments and/or challenges in relevant areas; papers addressing foundational issues; papers on the history of statistics and probability; white papers on topics of importance to the profession or society; and historical assessment of seminal papers in the field and their impact.