Stat: Latest Articles


Table inference for combinatorial origin‐destination choices in agent‐based population synthesis

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-03-06 DOI: 10.1002/sta4.656

Ioannis Zachos, Theodoros Damoulas, Mark Girolami

A key challenge in agent‐based mobility simulations is the synthesis of individual agent socioeconomic profiles. Such profiles include locations of agent activities, which dictate the quality of the simulated travel patterns. These locations are typically represented in origin‐destination matrices that are sampled using coarse travel surveys. This is because fine‐grained trip profiles are scarce and fragmented due to privacy and cost reasons. The discrepancy between data and sampling resolutions renders agent traits nonidentifiable due to the combinatorial space of data‐consistent individual attributes. This problem is pertinent to any agent‐based inference setting where the latent state is discrete. Existing approaches have used continuous relaxations of the underlying location assignments and subsequent ad hoc discretisation thereof. We propose a framework to efficiently navigate this space offering improved reconstruction and coverage as well as linear‐time sampling of the ground truth origin‐destination table. This allows us to avoid factorially growing rejection rates and poor summary statistic consistency inherent in discrete choice modelling. We achieve this by introducing joint sampling schemes for the continuous intensity and discrete table of agent trips, as well as Markov bases that can efficiently traverse this combinatorial space subject to summary statistic constraints. Our framework's benefits are demonstrated in multiple controlled experiments and a large‐scale application to agent work trip reconstruction in Cambridge, UK.
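To illustrate the idea of traversing the space of tables that agree with given summary statistics, here is a minimal sketch of a random walk built from the classical 2x2 swap moves, which form a Markov basis for two-way tables with fixed row and column sums. It is not the authors' sampler: the example table, the function name `markov_basis_step`, and the uniform target implied by the accept/reject rule are all assumptions made for illustration.

```python
import numpy as np

def markov_basis_step(table, rng):
    """Propose one 2x2 swap move; it leaves every row and column sum unchanged."""
    n_rows, n_cols = table.shape
    i, k = rng.choice(n_rows, size=2, replace=False)  # two distinct origins
    j, l = rng.choice(n_cols, size=2, replace=False)  # two distinct destinations
    move = np.zeros_like(table)
    move[i, j] = move[k, l] = 1
    move[i, l] = move[k, j] = -1
    proposal = table + move
    # Stay in place if the move would create a negative trip count.
    return proposal if (proposal >= 0).all() else table

rng = np.random.default_rng(0)
# Hypothetical origin-destination table (rows: origins, columns: destinations).
table = np.array([[3, 1, 2], [0, 4, 1], [2, 2, 5]])
current = table.copy()
for _ in range(200):
    current = markov_basis_step(current, rng)
```

Every table visited by the walk has the same margins as the starting table, so the constraint never has to be enforced by rejection sampling over unconstrained tables.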


Citations: 0

Image registration for zooming: A statistically consistent local feature mapping approach

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-03-05 DOI: 10.1002/sta4.664

Sujay Das, Anik Roy, Partha Sarathi Mukherjee

Image registration is a widely used tool for matching two images of the same scene with one another. In the literature, several image registration techniques are available to register rigid-body and non-rigid-body transformations. One such important transformation is zooming. There are very few feature-based methods that address this particular problem. These methods fail miserably when there are only a limited number of point features available in the image. This paper proposes a feature-based approach that works with a feature that is readily available in almost all images, for registering two images of the same image object where one is a zoomed-in version of the other. In the proposed method, we first detect the possible edge points, which we consider as features in both the reference and the zoomed image. Then, we map these features of the reference and the zoomed image with one another and find the relationship between them using a mathematical model. Finally, we use the relationship to register the zoomed-in image. This method outperforms some of the state-of-the-art methods on many occasions. Several numerical examples and some statistical properties justify that this method works well in many applications.
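As a rough illustration of the "find the relationship between mapped features" step, the sketch below estimates a single zoom factor by least squares from already-matched points. It assumes the correspondences are known and that the zoom is centred at the origin of the coordinate system; the paper's edge detection, feature mapping, and model are not reproduced, and `estimate_zoom` is a hypothetical name.

```python
import numpy as np

def estimate_zoom(ref_points, zoomed_points):
    """Least-squares scale s minimising sum_i ||q_i - s * p_i||^2."""
    p = np.asarray(ref_points, dtype=float)
    q = np.asarray(zoomed_points, dtype=float)
    return float((p * q).sum() / (p * p).sum())

# Hypothetical matched feature coordinates, relative to the zoom centre.
ref = np.array([[1.0, 0.0], [0.0, 2.0], [-1.5, 1.0], [2.0, -1.0]])
zoomed = 1.5 * ref  # noise-free zoom by a factor of 1.5
scale = estimate_zoom(ref, zoomed)
```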


Citations: 0

D-optimal designs for multi-response linear models with two groups

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-03-05 DOI: 10.1002/sta4.665

Xin Liu, Lei He, Rong-Xian Yue

In recent years, multi-response linear models have gained significant popularity in various statistical applications. However, the design aspects of multi-response linear models with group-wise considerations have received limited attention in the literature. This paper aims to thoroughly investigate $D$-optimal designs for such models. Specifically, we focus on scenarios involving two groups, where the proportions of observations for each group can be arbitrarily selected or not. Two equivalence theorems are presented to elaborate the characterization of $D$-optimal designs. Additionally, we delve into the admissibility of approximate designs and establish necessary conditions for a design to be deemed admissible. Several illustrative examples are addressed to demonstrate the application of the derived theoretical results.
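For readers unfamiliar with equivalence theorems, the sketch below checks the classical Kiefer-Wolfowitz condition in the simplest setting: a single-response, single-group straight-line model on [-1, 1]. A design is D-optimal there exactly when the variance function f(x)' M^{-1} f(x) never exceeds the number of parameters p. This is only the textbook case; the paper's theorems for multi-response models with two groups are not reproduced, and the function names are illustrative.

```python
import numpy as np

def f(x):
    # Regression functions of the straight-line model y = b0 + b1 * x.
    return np.array([1.0, x])

def information_matrix(points, weights):
    # M(xi) = sum_i w_i f(x_i) f(x_i)' for an approximate design xi.
    return sum(w * np.outer(f(x), f(x)) for x, w in zip(points, weights))

# Candidate design: half of the observations at each end of [-1, 1].
points, weights = [-1.0, 1.0], [0.5, 0.5]
M = information_matrix(points, weights)
M_inv = np.linalg.inv(M)

# Variance function d(x) = f(x)' M^{-1} f(x) evaluated on a grid.
grid = np.linspace(-1.0, 1.0, 201)
d = np.array([f(x) @ M_inv @ f(x) for x in grid])
max_d = d.max()  # equals p = 2, attained at the support points
```

Here d(x) = 1 + x^2, so the bound p = 2 holds on the whole interval with equality at x = -1 and x = 1, which confirms D-optimality of the candidate design for this simple model.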


Citations: 0

Asymptotic behaviour of a non‐autonomous multispecies Holling type II model with a complex type of noises

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-03-05 DOI: 10.1002/sta4.667

Libai Xu, Xintong Ma, Yanyan Zhao

The deterministic non‐autonomous multispecies Holling type II model and its stochastic version with a simple type of noise have been proposed to infer multispecies community structure. However, these models fail to account for complex types of noises, which may render the model overly simplistic. In this paper, a non‐autonomous multispecies Holling type II model with a complex type of noise has been proposed. We establish sufficient conditions for various mathematical properties of the solutions, including existence and uniqueness, stochastic permanence and extinction. Additionally, numerical simulation studies are provided to illustrate our theoretical findings.
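As background, the sketch below simulates a two-species predator-prey system with a Holling type II functional response and multiplicative white noise, using the Euler-Maruyama scheme. It corresponds to the "simple type of noise" setting with constant coefficients, not to the non-autonomous multispecies model with complex noise studied in the paper; all parameter values and names are assumptions chosen for illustration.

```python
import numpy as np

def simulate(x0=1.0, y0=0.5, steps=5000, dt=0.001, seed=0):
    """Euler-Maruyama path of a stochastic Holling type II predator-prey model."""
    rng = np.random.default_rng(seed)
    # Hypothetical constants: growth, competition, attack rate,
    # handling time, predator mortality, conversion efficiency.
    r, a, c, h, m, e = 1.0, 0.5, 1.0, 1.0, 0.3, 0.8
    s1, s2 = 0.1, 0.1  # noise intensities
    path = np.empty((steps + 1, 2))
    path[0] = (x0, y0)
    for t in range(steps):
        x, y = path[t]
        feeding = c * x / (1.0 + h * x)  # Holling type II response per predator
        dw1, dw2 = rng.normal(0.0, np.sqrt(dt), size=2)  # Brownian increments
        path[t + 1, 0] = x + (x * (r - a * x) - feeding * y) * dt + s1 * x * dw1
        path[t + 1, 1] = y + y * (-m + e * feeding) * dt + s2 * y * dw2
    return path

path = simulate()
```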


Citations: 0

What matters to graduate students? Experiences at a statistical consulting center from pre‐ to post‐COVID‐19 pandemic

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-03-04 DOI: 10.1002/sta4.659

Marianne Huebner, Steven J. Pierce, Andrew J. Dennhardt, Hope Akaeze, Nicole Jess, Wenjuan Ma

The COVID‐19 pandemic led to unprecedented changes in all levels of society, including the statistical consulting field. This paper focuses on the experiences of graduate student consultants and clients at our statistical consulting center (SCC) that operates all year independent of semesters. During the lockdown period, work continued without interruption and was conducted remotely, but there was a temporary reduction in utilization. Advice on statistical methods, help with data analysis and educational offerings are the main appeals to utilize SCC services. We describe our mentoring approach for graduate student research assistants (RAs) and how pandemic changes affected RAs and clients. Based on experiences during the pandemic, we offer practical suggestions for SCCs' approaches to research support, work characteristics and collaborations to improve the experiences of graduate students, both as consultants and clients. Most collaboration meetings are now virtual by request from clients. Telecommuting supports flexible personal schedules and needs. Online educational offerings provide easier access for participants and more opportunities for a wider range of topics and presenters. However, mentoring sessions for RAs are best conducted in‐person, and every effort should be made to encourage in‐person interactions and collaborations between staff members to advance the effectiveness of post‐pandemic SCCs.


Citations: 0

Highly private large‐sample tests for contingency tables

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-02-29 DOI: 10.1002/sta4.658

Sungkyu Jung, Seung Woo Kwak

Differential privacy is a foundational concept for safeguarding sensitive individual information when releasing data or statistical analysis results. In this study, we concentrate on the protection of privacy in the context of goodness‐of‐fit (GOF) and independence tests, utilizing perturbed contingency tables that adhere to Gaussian differential privacy within the high‐privacy regime, where the degrees of privacy protection increase as the sample size increases. We introduce private test procedures for GOF, independence of two variables and the equality of proportions in paired samples, similar to McNemar's test. For each of these hypothesis testing situations, we propose private test statistics based on the statistics and establish their asymptotic null distributions. We numerically confirm that Type I error rates of the proposed private test procedures are well controlled and have adequate power for larger sample sizes and effect sizes. The proposal is demonstrated in private inferences based on the American Time Use Survey data.
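To make "perturbed contingency tables" concrete, the sketch below adds independent Gaussian noise to cell counts and then evaluates a plain Pearson goodness-of-fit statistic on the noisy table. It is a naive plug-in under stated assumptions, not the authors' test: the counts, the null probabilities, and the noise scale `sigma` are hypothetical. Under the standard Gaussian mechanism the noise scale would be set to the sensitivity divided by the privacy parameter.

```python
import numpy as np

def gaussian_mechanism(counts, sigma, rng):
    """Release cell counts perturbed by independent N(0, sigma^2) noise."""
    counts = np.asarray(counts, dtype=float)
    return counts + rng.normal(0.0, sigma, size=counts.shape)

def pearson_gof(counts, probs):
    """Pearson goodness-of-fit statistic of counts against null probabilities."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() * np.asarray(probs, dtype=float)
    return float(((counts - expected) ** 2 / expected).sum())

rng = np.random.default_rng(1)
counts = np.array([30, 50, 20])          # hypothetical observed table
null_probs = np.array([0.3, 0.5, 0.2])   # hypothetical null hypothesis
noisy = gaussian_mechanism(counts, sigma=2.0, rng=rng)

stat_raw = pearson_gof(counts, null_probs)     # non-private statistic
stat_private = pearson_gof(noisy, null_probs)  # naive plug-in on the noisy table
```

Because the injected noise adds variance, the plug-in statistic does not follow the usual chi-squared reference distribution, which is why dedicated asymptotic null distributions are needed for private tests.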


Citations: 0

Machine collaboration

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-02-29 DOI: 10.1002/sta4.661

Qingfeng Liu, Yang Feng

We propose a new ensemble framework for supervised learning, called machine collaboration (MaC), using a collection of possibly heterogeneous base learning methods (hereafter, base machines) for prediction tasks. Unlike bagging/stacking (a parallel and independent framework) and boosting (a sequential and top-down framework), MaC is a type of circular and recursive learning framework. The circular and recursive nature helps the base machines to transfer information circularly and update their structures and parameters accordingly. The theoretical result on the risk bound of the estimator from MaC reveals that the circular and recursive feature can help MaC reduce risk via a parsimonious ensemble. We conduct extensive experiments on MaC using both simulated data and 119 benchmark real datasets. The results demonstrate that in most cases, MaC performs significantly better than several other state-of-the-art methods, including classification and regression trees, neural networks, stacking, and boosting.


Citations: 0

Linear mixed models for complex survey data: Implementing and evaluating pairwise likelihood

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-02-27 DOI: 10.1002/sta4.657

Thomas Lumley, Xudong Huang

As complex-survey data become more widely used in health and social science research, there is increasing interest in fitting a wider range of regression models. We describe an implementation of two-level linear mixed models in R using the pairwise composite likelihood approach of Rao and co-workers. We discuss the computational efficiency of pairwise composite likelihood and compare the estimator to the existing sequential pseudolikelihood estimator in simulations and in data from the Programme for International Student Assessment (PISA) educational survey.


Citations: 0

A note about why deep learning is deep: A discontinuous approximation perspective

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-02-22 DOI: 10.1002/sta4.654

Yongxin Li, Haobo Qi, Hansheng Wang

Deep learning has achieved unprecedented success in recent years. This approach essentially uses the composition of nonlinear functions to model the complex relationship between input features and output labels. However, a comprehensive theoretical understanding of why the hierarchical layered structure can exhibit superior expressive power is still lacking. In this paper, we provide an explanation for this phenomenon by measuring the approximation efficiency of neural networks with respect to discontinuous target functions. We focus on deep neural networks with rectified linear unit (ReLU) activation functions. We find that to achieve the same degree of approximation accuracy, the number of neurons required by a single‐hidden‐layer (SHL) network is exponentially greater than that required by a multi‐hidden‐layer (MHL) network. In practice, discontinuous points tend to contain highly valuable information (i.e., edges in image classification). We argue that this may be a very important reason accounting for the impressive performance of deep neural networks. We validate our theory in extensive experiments.
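As a small illustration of how ReLU units handle a discontinuity, the sketch below builds a ramp from two ReLU units that agrees with the step function 1{x > 0} everywhere except on the interval (0, 1/k). It only shows the one-dimensional building block and does not reproduce the paper's comparison of neuron counts between single-hidden-layer and multi-hidden-layer networks; the names `ramp` and `k` are illustrative.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def ramp(x, k):
    """Two-unit ReLU ramp equal to clip(k * x, 0, 1); steeper as k grows."""
    return relu(k * x) - relu(k * x - 1.0)

x = np.array([-1.0, 0.0, 0.05, 1.0])
values = ramp(x, k=10.0)  # 0 left of the jump, 1 to the right of 1/k
```

Increasing `k` shrinks the region where the ramp and the step disagree, at the cost of larger weights.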


Citations: 0

Reproducible research practices: A tool for effective and efficient leadership in collaborative statistics

IF 1.7 | Zone 4 (Mathematics) | Q3 Statistics & Probability

Stat

Pub Date : 2024-02-11 DOI: 10.1002/sta4.653

Camille J. Hochheimer, Grace N. Bosma, Lauren Gunn-Sandell, Mary D. Sammel

With data and code sharing policies more common and version control more widely used in statistics, standards for reproducible research are higher than ever. Reproducible research practices must keep up with the fast pace of research. To do so, we propose combining modern practices of leadership with best practices for reproducible research in collaborative statistics as an effective tool for ensuring quality and accuracy while developing stewardship and autonomy in the people we lead. First, we establish a framework for expectations of reproducible statistical research. Then, we introduce Stephen M.R. Covey's theory of trusting and inspiring leadership. These two are combined as we show how stewardship agreements can be used to make reproducible coding a team norm. We provide an illustrative code example and highlight how this method creates a more collaborative rather than evaluative culture where team members hold themselves accountable. The goal of this manuscript is for statisticians to find this application of leadership theory useful and to inspire them to intentionally develop their personal approach to leadership.


Citations: 0
