Asta-Advances in Statistical Analysis: latest articles

Discussion on "On the role of data, statistics and decisions in a pandemic"

IF 1.4 | Zone 4 (Mathematics) | Q2 STATISTICS & PROBABILITY

Asta-Advances in Statistical Analysis

Pub Date : 2022-06-10 DOI: 10.1007/s10182-022-00450-y

Ursula Berger, Göran Kauermann, Helmut Küchenhoff

The authors make an important contribution by presenting a comprehensive and thoughtful overview of the many different aspects of data, statistics and data analysis during the recent COVID-19 pandemic, discussing all relevant topics. The paper certainly provides a very valuable reflection on what has been done, what could have been done and what needs to be done. We contribute here a few comments and some additional issues. We do not discuss all chapters of Jahn et al. (AStA Adv Stat Anal, 2022. 10.1007/s10182-022-00439-7), but focus on those where our personal views and experiences might add some further aspects.

Citations: 5

Describing a landscape we are yet discovering

Pub Date : 2022-06-09 DOI: 10.1007/s10182-022-00449-5

Sebastian Contreras, Jonas Dehning, Viola Priesemann

Citations: 3

Hierarchical clustering and matrix completion for the reconstruction of world input–output tables

Pub Date : 2022-06-02 DOI: 10.1007/s10182-022-00448-6

Rodolfo Metulini, Giorgio Gnecco, Francesco Biancalani, Massimo Riccaboni

Multi-regional input–output (I/O) matrices provide the networks of within- and cross-country economic relations. In the context of I/O analysis, the methodology adopted by national statistical offices in data collection raises the issue of obtaining reliable data in a timely fashion, which makes the reconstruction of (parts of) the I/O matrices of particular interest. In this work, we propose a method combining hierarchical clustering and matrix completion with a LASSO-like nuclear norm penalty to predict missing entries of a partially unknown I/O matrix. Through analyses based on both real-world and synthetic I/O matrices, we study the effectiveness of the proposed method in predicting missing values from both previous years' data and current data related to countries similar to the one for which current data are obscured. To show the usefulness of our method, an application based on World Input–Output Database (WIOD) tables (an example of industry-by-industry I/O tables) is provided. Strong similarities in structure between WIOD and other I/O tables are also found, which make the proposed approach easily generalizable to them.

Volume 107, issue 3, pp. 575–620. PDF: https://link.springer.com/content/pdf/10.1007/s10182-022-00448-6.pdf

Citations: 4
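
The abstract by Metulini et al. above describes matrix completion with a LASSO-like nuclear norm penalty. As a rough illustration of that idea only, the sketch below uses a Soft-Impute-style iteration (repeated SVD with soft-thresholded singular values) on a synthetic low-rank matrix. It is not the authors' implementation and omits their hierarchical clustering step; the function name `soft_impute`, the penalty value `lam`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def soft_impute(X, mask, lam, n_iter=200, tol=1e-6):
    """Estimate a low-rank matrix from the entries of X where mask is True.

    Hypothetical helper: each pass fills the unobserved entries with the
    current estimate, then shrinks the singular values by lam, which is
    the proximal step for a nuclear norm penalty.
    """
    Z = np.zeros_like(X, dtype=float)
    for _ in range(n_iter):
        filled = np.where(mask, X, Z)            # observed data + current guess
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s_shrunk = np.maximum(s - lam, 0.0)      # soft-threshold singular values
        Z_new = (U * s_shrunk) @ Vt
        change = np.linalg.norm(Z_new - Z)
        Z = Z_new
        if change <= tol * max(1.0, np.linalg.norm(Z)):
            break
    return Z


# Synthetic rank-2 "table" with roughly 30% of the entries hidden.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(A.shape) < 0.7
Z = soft_impute(A, mask, lam=0.1)

# Relative error on the hidden entries (1.0 would match predicting zeros).
rel_err = np.linalg.norm((Z - A)[~mask]) / np.linalg.norm(A[~mask])
print(f"relative error on hidden entries: {rel_err:.3f}")
```

In practice the penalty `lam` would be tuned, for example by holding out some observed entries; the fixed value here is only for demonstration.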

Comment on: On the role of data, statistics and decisions in a pandemic statistics for climate protection and health—dare (more) progress!

Pub Date : 2022-05-21 DOI: 10.1007/s10182-022-00447-7

Walter J. Radermacher

In the Corona pandemic, it became starkly clear how much good-quality statistics are needed and, at the same time, how unsuccessful we are at providing such statistics despite the existing technical and methodological possibilities and diverse data sources. It is therefore long overdue to get to the bottom of the causes of these issues and to learn from the findings. This sets a high aspiration: first, a diagnosis should be carried out in which the causes of the deficiencies, along with their interactions, are identified as broadly as possible. Second, such a broad diagnosis should result in a therapy that includes a coherent strategy that can be generalised, i.e. that goes beyond the Corona pandemic.

Citations: 5

Tests of stochastic dominance with repeated measurements data

Pub Date : 2022-05-11 DOI: 10.1007/s10182-022-00446-8

Angel G. Angelov, Magnus Ekström

The paper explores a testing problem which involves four hypotheses, that is, based on observations of two random variables X and Y, we wish to discriminate between four possibilities: identical survival functions, stochastic dominance of X over Y, stochastic dominance of Y over X, or crossing survival functions. Four-decision testing procedures for repeated measurements data are proposed. The tests are based on a permutation approach and do not rely on distributional assumptions. One-sided versions of the Cramér–von Mises, Anderson–Darling, and Kolmogorov–Smirnov statistics are utilized. The consistency of the tests is proven. A simulation study shows good power properties and control of false-detection errors. The suggested tests are applied to data from a psychophysical experiment.

Citations: 0
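
To make the four-decision idea in the abstract by Angelov and Ekström more concrete, here is a simplified sketch for paired data. It assumes a one-sided Kolmogorov–Smirnov statistic and a permutation scheme that randomly swaps the two measurements within each pair; the decision rule (combine two one-sided tests) is an illustrative simplification, not the authors' exact procedure. All function names are hypothetical.

```python
import numpy as np


def one_sided_ks(x, y):
    """sup_t [F_y(t) - F_x(t)]: large when x tends to exceed y somewhere."""
    grid = np.concatenate([x, y])
    Fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    Fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(Fy - Fx))


def paired_permutation_pvalues(x, y, n_perm=2000, seed=0):
    """Permutation p-values for both one-sided statistics (within-pair swaps)."""
    rng = np.random.default_rng(seed)
    obs_xy, obs_yx = one_sided_ks(x, y), one_sided_ks(y, x)
    count_xy = count_yx = 0
    for _ in range(n_perm):
        swap = rng.random(len(x)) < 0.5          # swap labels inside each pair
        xp, yp = np.where(swap, y, x), np.where(swap, x, y)
        count_xy += one_sided_ks(xp, yp) >= obs_xy
        count_yx += one_sided_ks(yp, xp) >= obs_yx
    return (count_xy + 1) / (n_perm + 1), (count_yx + 1) / (n_perm + 1)


def four_decision(x, y, alpha=0.05, **kwargs):
    """Return one of four verdicts based on the two one-sided tests."""
    p_xy, p_yx = paired_permutation_pvalues(x, y, **kwargs)
    rej_xy, rej_yx = p_xy <= alpha, p_yx <= alpha
    if rej_xy and rej_yx:
        return "crossing survival functions"
    if rej_xy:
        return "X dominates Y"
    if rej_yx:
        return "Y dominates X"
    return "no evidence against identical survival functions"


# Synthetic repeated measurements: the second condition adds a positive shift.
rng = np.random.default_rng(1)
subject_effect = rng.normal(size=40)
y = subject_effect + rng.normal(scale=0.5, size=40)
x = subject_effect + 1.0 + rng.normal(scale=0.5, size=40)
print(four_decision(x, y))
```

Swapping within pairs keeps each subject's two measurements together, which is what makes the permutation valid for repeated measurements rather than two independent samples.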

On dealing with the unknown population minimum in parametric inference

Pub Date : 2022-05-05 DOI: 10.1007/s10182-022-00445-9

Matheus Henrique Junqueira Saldanha, Adriano Kamimura Suzuki

A myriad of physical, biological and other phenomena are better modeled with semi-infinite distribution families, in which case not knowing the population minimum becomes a hassle when performing parametric inference. Ad hoc methods to deal with this problem exist, but are suboptimal and sometimes infeasible. Besides, having the statistician handcraft solutions on a case-by-case basis is counterproductive. In this paper, we propose a framework under which the issue can be analyzed, and perform an extensive search in the literature for methods that could be used to solve the aforementioned problem; we also propose a method of our own. Simulation experiments were then performed to compare some methods from the literature and our proposal. We found that the straightforward method, which is to infer the population minimum by maximum likelihood, has severe difficulty in giving a good estimate for the population minimum, but manages to achieve very good inferred models. The other methods, including our proposal, involve estimating the population minimum, and we found that our method is superior to the other methods of this kind, considering the distributions simulated, followed very closely by the endpoint estimator by Alves et al. (Stat Sin 24(4):1811–1835, 2014). Although these two give much more accurate estimates for the population minimum, the straightforward method also displays some advantages, so choosing between these three methods will depend on the problem domain.

Volume 107, issue 3, pp. 509–535.

Citations: 1
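
The "straightforward method" mentioned in the abstract by Saldanha and Suzuki treats the population minimum as one more parameter to be estimated by maximum likelihood. A minimal sketch of what that means is given below for the shifted exponential distribution, chosen here only because its estimates have a closed form; this choice of family, the function name, and the simulated data are assumptions and do not come from the paper.

```python
import numpy as np


def fit_shifted_exponential(data):
    """Maximum likelihood fit of f(x) = exp(-(x - c) / theta) / theta, x >= c.

    The likelihood increases in c up to the smallest observation, so the
    MLE of the population minimum c is the sample minimum; given c, the
    MLE of the scale theta is the mean excess over c.
    """
    data = np.asarray(data, dtype=float)
    c_hat = data.min()
    theta_hat = data.mean() - c_hat
    return c_hat, theta_hat


# Simulated data with true minimum 3.0 and true scale 2.0.
rng = np.random.default_rng(0)
sample = 3.0 + rng.exponential(scale=2.0, size=500)
c_hat, theta_hat = fit_shifted_exponential(sample)
print(f"estimated minimum: {c_hat:.4f}, estimated scale: {theta_hat:.4f}")
```

Note that the estimate of the minimum can never fall below the true value in this family, so it is biased upward for any finite sample. For other semi-infinite families the likelihood in the location parameter may be much less well behaved.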

Local spatial log-Gaussian Cox processes for seismic data

Pub Date : 2022-04-25 DOI: 10.1007/s10182-022-00444-w

Nicoletta D’Angelo, Marianna Siino, Antonino D’Alessandro, Giada Adelfio

In this paper, we propose the use of advanced and flexible statistical models to describe the spatial displacement of earthquake data. The paper aims to account for external geological information in the description of complex seismic point processes, through the estimation of models with space-varying parameters. A local version of the Log-Gaussian Cox process (LGCP) is introduced and applied for the first time, exploiting the inferential tools in Baddeley (Spat Stat 22:261–295, 2017) and estimating the model by the local Palm likelihood. We provide methods and approaches accounting for the interaction among points, typically described by LGCP models through the estimation of the covariance parameters of the Gaussian Random Field, which in this local version are allowed to vary in space, providing a more realistic description of the clustering feature of seismic events. Furthermore, we contribute to the framework of diagnostics, outlining suitable methods for the local context and proposing a new step-wise approach addressing the particular case of multiple covariates. Overall, we show that local models provide good inferential results and could serve as the basis for future spatio-temporal local model developments, specific to the description of the complex seismic phenomenon.

Volume 106, issue 4, pp. 633–671. PDF: https://link.springer.com/content/pdf/10.1007/s10182-022-00444-w.pdf

Citations: 10
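
For readers unfamiliar with the model class in the abstract by D'Angelo et al., the sketch below simulates an ordinary (global) log-Gaussian Cox process on a grid over the unit square: a Gaussian random field is drawn, exponentiated to give an intensity, and cell counts are drawn as Poisson variables. It assumes an exponential covariance function and fixed parameters everywhere; it does not implement the local, space-varying version or the local Palm likelihood estimation that the paper proposes. Names and parameter values are illustrative.

```python
import numpy as np


def simulate_lgcp_counts(n_side=15, mu=6.0, sigma2=1.0, corr_range=0.2, seed=0):
    """Simulate gridded counts from an LGCP on the unit square.

    Assumed covariance of the Gaussian field:
    C(d) = sigma2 * exp(-d / corr_range).
    Returns (counts, intensity), both with shape (n_side, n_side).
    """
    rng = np.random.default_rng(seed)
    centers = (np.arange(n_side) + 0.5) / n_side
    gx, gy = np.meshgrid(centers, centers)
    pts = np.column_stack([gx.ravel(), gy.ravel()])

    # Pairwise distances between cell centres and the covariance matrix.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    cov = sigma2 * np.exp(-dist / corr_range)

    # Draw the Gaussian field via a Cholesky factor (small jitter for stability).
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(len(pts)))
    field = chol @ rng.standard_normal(len(pts))

    # Conditional on the field, counts per cell are Poisson.
    intensity = np.exp(mu + field)
    cell_area = 1.0 / n_side**2
    counts = rng.poisson(intensity * cell_area)
    return counts.reshape(n_side, n_side), intensity.reshape(n_side, n_side)


counts, intensity = simulate_lgcp_counts()
print("total simulated events:", int(counts.sum()))
```

Larger `sigma2` or `corr_range` produces stronger and broader clusters, which is the kind of interaction structure the covariance parameters in the abstract control.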

Some measures of kurtosis and their inference on large datasets

Pub Date : 2022-04-14 DOI: 10.1007/s10182-022-00442-y

Claudio Giovanni Borroni, Lucio De Capitani

This paper deals with the estimation of kurtosis on large datasets. It aims at overcoming two frequent limitations in applications: first, Pearson's standardized fourth moment is computed as the only measure of kurtosis; second, the fact that data might be just samples is neglected, so that the opportunity of using suitable inferential tools, like standard errors and confidence intervals, is discarded. In the paper, some recent indexes of kurtosis are reviewed as alternatives to Pearson's standardized fourth moment. The asymptotic distribution of their natural estimators is derived, and it is used as a tool to evaluate efficiency and to build confidence intervals. A simulation study is also conducted to provide practical indications about the choice of a suitable index. In conclusion, researchers are warned against the use of the classical Pearson's index when the sample size is too low and/or the distribution is skewed and/or heavy-tailed. Specifically, the occurrence of heavy tails can deprive Pearson's index of any meaning or produce unreliable confidence intervals. However, such limitations can be overcome by reverting to the reviewed alternative indexes, relying just on low-order moments.

Volume 106, issue 4, pp. 573–607. PDF: https://link.springer.com/content/pdf/10.1007/s10182-022-00442-y.pdf

Citations: 1
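
As a point of reference for the abstract by Borroni and De Capitani, the sketch below computes Pearson's standardized fourth moment together with a simple confidence interval. The interval uses the textbook normal-theory standard error sqrt(24/n), which is an assumption valid only for approximately Gaussian data; that restriction is exactly the kind of limitation the abstract warns about. The alternative indexes reviewed in the paper are not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def pearson_kurtosis(x):
    """Pearson's standardized fourth moment m4 / m2**2 (equals 3 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d**2)
    m4 = np.mean(d**4)
    return m4 / m2**2


def normal_theory_ci(x, z=1.96):
    """Approximate 95% interval using SE = sqrt(24 / n).

    This standard error is derived under normality; for skewed or
    heavy-tailed data it can be badly wrong.
    """
    n = len(x)
    b2 = pearson_kurtosis(x)
    se = np.sqrt(24.0 / n)
    return b2 - z * se, b2 + z * se


rng = np.random.default_rng(0)
gaussian_sample = rng.normal(size=100_000)
heavy_sample = rng.standard_t(df=3, size=100_000)  # fourth moment does not exist

print("Gaussian:", pearson_kurtosis(gaussian_sample), normal_theory_ci(gaussian_sample))
print("Student t(3):", pearson_kurtosis(heavy_sample))
```

Rerunning the heavy-tailed case with different seeds tends to give very different values, because the population fourth moment is infinite for a Student t distribution with 3 degrees of freedom.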

A quantile regression perspective on external preference mapping

Pub Date : 2022-04-12 DOI: 10.1007/s10182-022-00440-0

Cristina Davino, Tormod Næs, Rosaria Romano, Domenico Vistocco

External preference mapping is widely used in marketing and R&D divisions to understand consumer behaviour. The most common preference map is obtained through a two-step procedure that combines principal component analysis and least squares regression. The standard approach exploits classical regression and therefore focuses on the conditional mean. This paper proposes the use of quantile regression to enrich the preference map by looking at the whole distribution of consumer preference. The enriched maps highlight possible differences in consumer behaviour with respect to the least or most preferred products. This is pursued by exploring the variability of liking along the principal components as well as focusing on the direction of preference. The use of different aesthetics (colours, shapes, size, arrows) equips the standard preference map with additional information and does not force users to change the standard tool they are used to. The proposed methodology is shown in action on a case study pertaining to yogurt preferences.

Volume 106, issue 4, pp. 545–571. PDF: https://link.springer.com/content/pdf/10.1007/s10182-022-00440-0.pdf

Citations: 0
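
The two-step procedure in the abstract by Davino et al. (principal components first, regression second) can be sketched as follows, with the least squares step replaced by quantile regression. This is a minimal illustration on synthetic data, assuming NumPy and SciPy are available: quantile regression is solved as a linear program with `scipy.optimize.linprog`, and the "sensory" matrix, liking scores, and function names are invented for the example rather than taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def pca_scores(X, n_components=2):
    """Scores of the rows of X on the first principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def quantile_regression(X, y, q):
    """Coefficients minimising the check loss at quantile level q.

    LP formulation: y = X @ beta + u_plus - u_minus with u_plus, u_minus >= 0,
    minimising q * sum(u_plus) + (1 - q) * sum(u_minus).
    """
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), q * np.ones(n), (1 - q) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


# Step 1: principal components of a synthetic products-by-attributes matrix.
rng = np.random.default_rng(0)
sensory = rng.normal(size=(12, 5))
design = np.column_stack([np.ones(12), pca_scores(sensory)])

# Step 2: regress synthetic liking scores on the components at several quantiles.
liking = design @ np.array([5.0, 1.0, -0.5]) + rng.normal(scale=0.5, size=12)
for q in (0.25, 0.5, 0.75):
    print(q, quantile_regression(design, liking, q))
```

Comparing the slope coefficients across quantile levels is what allows the map to show whether the direction of preference differs between low and high liking scores.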

Group sparse recovery via group square-root elastic net and the iterative multivariate thresholding-based algorithm

Pub Date : 2022-04-08 DOI: 10.1007/s10182-022-00443-x

Wanling Xie, Hu Yang

In this work, we propose a novel group selection method called Group Square-Root Elastic Net. It is based on square-root regularization with a group elastic net penalty, i.e., a \(\ell_{2,1}+\ell_2\) penalty. As a type of square-root-based procedure, one distinct feature is that the estimator is independent of the unknown noise level \(\sigma\), which is non-trivial to estimate under the high-dimensional setting, especially when \(p \gg n\). In many applications, the estimator is expected to be sparse, not in an irregular way, but rather in a structured manner. It makes the proposed method very attractive to tackle both high-dimensionality and structured sparsity. We study the correct subset recovery under a Group Elastic Net Irrepresentable Condition. Both the slow rate bounds and fast rate bounds are established, the latter under the Restricted Eigenvalue assumption and Gaussian noise assumption. To implement, a fast algorithm based on the scaled multivariate thresholding-based iterative selection idea is introduced with proved convergence. A comparative study examines the superiority of our approach against alternatives.

Volume 107, issue 3, pp. 469–507.

Citations: 0
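
The algorithm named in the abstract by Xie and Yang is built on multivariate thresholding. The sketch below shows only the generic group soft-thresholding operator that such algorithms typically apply to each block of coefficients: a block is shrunk toward zero as a whole and set exactly to zero when its Euclidean norm is at most the threshold. The scaling and the full iteration used in the paper are not reproduced; the function names and the example groups are assumptions.

```python
import numpy as np


def group_soft_threshold(v, lam):
    """Shrink the vector v toward zero by lam in Euclidean norm.

    Returns the zero vector when ||v|| <= lam, which is what produces
    sparsity at the level of whole groups.
    """
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v


def threshold_by_group(beta, groups, lam):
    """Apply group soft-thresholding to each index group of beta."""
    beta = np.asarray(beta, dtype=float)
    out = beta.copy()
    for idx in groups:
        out[idx] = group_soft_threshold(beta[idx], lam)
    return out


beta = np.array([3.0, 4.0, 0.3, 0.4])
groups = [[0, 1], [2, 3]]          # hypothetical grouping of the coefficients
print(threshold_by_group(beta, groups, lam=1.0))
```

In the example, the first group has norm 5 and is scaled by 0.8, while the second group has norm 0.5 and is removed entirely, illustrating selection of whole groups rather than individual coefficients.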
