Nonlinear sufficient dimension reduction for distribution-on-distribution regression

Journal of Multivariate Analysis (IF 1.4; Zone 3, Mathematics; JCR Q2, Statistics & Probability), Pub Date: 2024-02-27, DOI: 10.1016/j.jmva.2024.105302

Qi Zhang, Bing Li, Lingzhou Xue


Citations: 0

Abstract


Nonlinear sufficient dimension reduction for distribution-on-distribution regression

We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional independence that determines sufficient dimension reduction. For univariate distributions, we construct the universal kernel using the Wasserstein distance, while for multivariate distributions, we resort to the sliced Wasserstein distance. The sliced Wasserstein distance ensures that the metric space possesses similar topological properties to the Wasserstein space, while also offering significant computation benefits. Numerical results based on synthetic data show that our method outperforms possible competing methods. The method is also applied to several data sets, including fertility and mortality data and Calgary temperature data.
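The two distances named in the abstract can be illustrated with a minimal sketch. It assumes each distribution is observed through an equal-size sample, and it assumes a Gaussian-type kernel of the form exp(-gamma * d^2) built on the distance; the abstract only states that the universal kernel is constructed from the Wasserstein or sliced Wasserstein distance, so the kernel form, the function names, and the parameters gamma, n_projections, and seed are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random


def wasserstein2_sq_1d(xs, ys):
    """Squared 2-Wasserstein distance between two empirical distributions on
    the real line with equally many samples: the mean squared gap between
    matching order statistics."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalizing a
    standard Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def sliced_wasserstein2_sq(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance:
    project both samples onto random directions and average the
    one-dimensional squared distances."""
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        px = [sum(t * c for t, c in zip(theta, x)) for x in xs]
        py = [sum(t * c for t, c in zip(theta, y)) for y in ys]
        total += wasserstein2_sq_1d(px, py)
    return total / n_projections


def gaussian_type_kernel(dist_sq, gamma=1.0):
    """Assumed kernel form exp(-gamma * d^2) applied to a squared distance."""
    return math.exp(-gamma * dist_sq)


# Univariate example: nu is mu shifted by 1, so the squared distance is 1.
mu = [0.0, 1.0, 2.0]
nu = [2.0, 3.0, 1.0]
print(wasserstein2_sq_1d(mu, nu))  # 1.0

# Bivariate example: two small point clouds, compared through random slices.
cloud_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
cloud_b = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
print(gaussian_type_kernel(sliced_wasserstein2_sq(cloud_a, cloud_b), gamma=0.5))
```

Each slice only requires sorting the projected samples, which is where the computational benefit of the sliced distance mentioned in the abstract comes from: no multivariate optimal transport problem is solved.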


Source journal

Journal of Multivariate Analysis (Mathematics: Statistics & Probability)

CiteScore: 2.40
Self-citation rate: 25.00%
Articles published: 108
Review time: 74 days

About the journal: Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data. The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of: copula modeling; functional data analysis; graphical modeling; high-dimensional data analysis; image analysis; multivariate extreme-value theory; sparse modeling; spatial statistics.

Latest articles from this journal

Consistency of empirical distributions of sequences of graph statistics in networks with dependent edges
Semiparametric density estimation with localized Bregman divergence
Tree-structured Markov random fields with Poisson marginal distributions
Model averaging for global Fréchet regression
Classification using global and local Mahalanobis distances