American Statistician: Latest Articles

On the number of replications in resampling tests and Monte Carlo simulation studies
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2026-01-08 DOI: 10.1080/00031305.2025.2612197
Daniel Gaigall, Julian Gerstenberg
Citations: 0
A Comparison of DeepSeek and Other LLMs
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-31 DOI: 10.1080/00031305.2025.2611010
Tianchen Gao, Jiashun Jin, Zheng Tracy Ke, Gabriel Moryoussef
Recently, DeepSeek has been the focus of attention in and beyond the AI community. An interesting problem is how DeepSeek compares to other large language models (LLMs). There are many tasks an LLM can do, and in this paper, we use the task of predicting an outcome using a short text for comparison. We consider two settings, an authorship classification setting and a citation classification setting. In the first one, the goal is to determine whether a short text is written by a human or by AI. In the second one, the goal is to classify a citation into one of four types using the textual content. For each experiment, we compare DeepSeek with 4 popular LLMs: Claude, Gemini, GPT, and Llama. We find that, in terms of classification accuracy, DeepSeek outperforms Gemini, GPT, and Llama in most cases, but underperforms Claude. We also find that DeepSeek is comparatively slower than the others but has a low cost to use, while Claude is much more expensive than all the others. Finally, we find that in terms of similarity, the output of DeepSeek is most similar to those of Gemini and Claude (and among all 5 LLMs, Claude and Gemini have the most similar outputs). In this paper, we also present a fully-labeled dataset collected by ourselves, and propose a recipe where we can use the LLMs and a recent data set, MADStat, to generate new data sets. The datasets in our paper can be used as benchmarks for future study on LLMs.
Citations: 0
Facilitating a Collaborative Relationship between Generative AI and the Statistics Student
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-29 DOI: 10.1080/00031305.2025.2608724
Richard A. Levine
This article examines how students can engage with generative artificial intelligence (genAI) as collaborators in the statistics learning process. Prompt engineering is positioned as a transferable, tool-agnostic competency that reinforces core elements of statistical thinking, including clarity, iteration, and purposeful inquiry. Through illustrative collaborations, we explore applications such as automating and optimizing code, acquiring programming syntax, and designing simulation studies. While these tasks are drawn from upper-level undergraduate and graduate coursework, the running example–a chi-squared test of association–is intended to spur ideas for incorporating genAI into the introductory statistics classroom. Supplementary materials include a) an outline of a learning management module and structure of the discussion and activities during my class periods covering this module on responsible use of generative AI; b) R Markdown files and compiled pdf documents intended to support classroom integration; c) illustrative comparisons across three widely used platforms–ChatGPT, Copilot, and Gemini–to highlight how differences in output style and reasoning can inform instructional design, rather than to rank or evaluate tools technically. The article concludes with a discussion of strategies for promoting ethical, transparent, and inclusive uses of genAI in statistics education.
Citations: 0
Abraham Wald and the Origins of the Sequential Probability Ratio Test
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-23 DOI: 10.1080/00031305.2025.2604805
Joel B. Greenhouse, Christopher J. Phillips
Citations: 0
Feature selection in Cox model with partially observed covariates: Application to oncology trials
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-22 DOI: 10.1080/00031305.2025.2606077
Ujjwal Das, Ranojoy Basu
Citations: 0
Probabilistic parameter estimates that require less small print
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-22 DOI: 10.1080/00031305.2025.2606079
James A. Hanley
Although we have had nearly a century to refine it, our teaching of confidence intervals for parameters is still imperfect. Despite all of our warnings regarding these intervals, it is not uncommon for end-users to misinterpret them. We discuss some possible reasons for this, and using a printed figure and a Shiny app, work through a simple and close-to-home example while trying to avoid many of these traps. We urge teachers to (a) begin with contexts that require less technical knowledge, or where the technical details can be kept out of the way; (b) avoid the traditional (and symmetric) ‘point estimate ± a z- or t-based margin of error’ confidence intervals that lead to lazy and muddled thinking; (c) start with a direct approach, rather than an indirect frequentist one that can end up being misinterpreted; and (d) encourage the reverse logic that asks what parameter values might have produced the data we see, rather than what data values will be produced by a parameter value.
Citations: 0
Probability Proofs for Stirling (and More): the Ubiquitous Role of 2π
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-22 DOI: 10.1080/00031305.2025.2603256
Nils Lid Hjort, Emil Aas Stoltenberg
The Stirling approximation formula for n! dates from 1730. Here we give new and instructive proofs of this and related approximation formulae via tools of probability and statistics. There are connections to the Central Limit Theorem and also to approximations of marginal distributions in Bayesian setups, with arguments which can be worked through by Master and PhD level students (and above). Certain formulae emerge by working through particular instances, some independently verifiable but others perhaps not. A particular case yielding new formulae is that of summing independent uniforms, related to the Irwin–Hall distribution. Yet further proofs of the Stirling flow from examining aspects of limiting normality of the sample median of uniforms, and from these again we find a proof for the Wallis product formula for π. A section detailing historical aspects and development is included, from Wallis 1656 and de Moivre and Stirling 1730 to Laplace 1778, etc.
Citations: 0
Beyond the Yard Line: Accommodating Rounded Sports Data in Statistical Models
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-22 DOI: 10.1080/00031305.2025.2604812
Amanda K. Glazer, Layla Parast, Mevin B. Hooten
In American football, rushing, passing, and receiving yards are recorded as whole numbers, but using a unique method of rounding, even though the true yardage is continuous and recorded precisely on the field. This rounding introduces measurement error that is systematically ignored in most statistical analyses of these data. Beyond rounding, football yardage presents additional challenges: it can take on negative values and is strongly skewed. These characteristics complicate distributional assumptions and propagate rounding effects. We illustrate the consequences of these issues using data from running backs during the 2023 National Football League regular season. We show that appropriately modeling play-level yardage as a discrete, skewed, and possibly negative quantity, without access to the true values, is important to reconcile the approach with the data generation process. We compare candidate models that correctly incorporate rounding from a model checking and validation perspective. Our findings underscore the broader importance of accounting for discretization and asymmetry in sports analytics and other fields, where recorded data may mask the underlying measurement process in ways that meaningfully affect statistical conclusions.
Citations: 0
Shared ancestors and the birthday problem
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-05 DOI: 10.1080/00031305.2025.2595972
Lily Agranat-Tamir, Kennedy D. Agwamba, Jazlyn A. Mooney, Noah A. Rosenberg
Citations: 0
Momentum effects in team sports: analyzing the interplay between offense and defense in the NBA
IF 1.8 Zone 4 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-04 DOI: 10.1080/00031305.2025.2595980
David Winkelmann, Rouven Michels
Citations: 0