
arXiv - STAT - Applications: Latest Papers

Who's the GOAT? Sports Rankings and Data-Driven Random Walks on the Symmetric Group
Pub Date : 2024-09-18 DOI: arxiv-2409.12107
Gian-Gabriel P. Garcia, J. Carlos Martínez Mori
Given a collection of historical sports rankings, can one tell which player is the greatest of all time (i.e., the GOAT)? In this work, we design a data-driven random walk on the symmetric group to obtain a stationary distribution over player rankings, spanning across different time periods in sports history. We combine this distribution with a notion of stochastic dominance to obtain a partial order over the players. We implement our methods using publicly available data from the Association of Tennis Professionals (ATP) and the Women's Tennis Association (WTA) to find the GOATs in the respective categories.
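The pipeline named in the abstract (a random walk on rankings, its stationary distribution, then stochastic dominance) can be illustrated on a toy case. The sketch below is not the authors' construction: the three players, the `history` list, the adjacent-swap transition rule, and the noise level `eps` are all assumptions made up for illustration.

```python
from itertools import permutations

import numpy as np

players = ["A", "B", "C"]  # hypothetical players
# Hypothetical historical rankings; each tuple lists player indices from best to worst.
history = [(0, 1, 2), (0, 1, 2), (1, 0, 2)]
eps = 0.05  # assumed noise level that keeps the chain irreducible

n = len(players)
states = list(permutations(range(n)))  # all n! rankings (the symmetric group)
index = {s: i for i, s in enumerate(states)}


def frac_above(a, b):
    """Fraction of historical rankings that place player a above player b."""
    return sum(h.index(a) < h.index(b) for h in history) / len(history)


# Assumed transition rule: pick an adjacent pair uniformly at random and
# swap it with a probability driven by the historical data.
P = np.zeros((len(states), len(states)))
for s in states:
    for r in range(n - 1):
        upper, lower = s[r], s[r + 1]
        swap_prob = eps + (1 - 2 * eps) * frac_above(lower, upper)
        t = list(s)
        t[r], t[r + 1] = t[r + 1], t[r]
        P[index[s], index[tuple(t)]] += swap_prob / (n - 1)
        P[index[s], index[s]] += (1 - swap_prob) / (n - 1)


def stationary(P, iters=10_000, tol=1e-13):
    """Power iteration for the stationary distribution pi with pi = pi P."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


pi = stationary(P)

# rank_dist[p, r] = stationary probability that player p holds rank r (0 = best).
rank_dist = np.zeros((n, n))
for s, prob in zip(states, pi):
    for r, p in enumerate(s):
        rank_dist[p, r] += prob


def dominates(p, q, atol=1e-9):
    """First-order stochastic dominance: p is at least as likely as q to be
    in the top k for every k, and strictly more likely for some k."""
    cp, cq = np.cumsum(rank_dist[p]), np.cumsum(rank_dist[q])
    return bool(np.all(cp >= cq - atol) and np.any(cp > cq + atol))
```

Calling `dominates(p, q)` for every pair yields a relation that is in general only a partial order, since two players' cumulative rank distributions may cross.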
Citations: 0
Conformity assessment of processes and lots in the framework of JCGM 106:2012
Pub Date : 2024-09-18 DOI: arxiv-2409.11912
Rainer Göb, Steffen Uhlig, Bernard Colson
ISO/IEC 17000:2020 defines conformity assessment as an "activity to determine whether specified requirements relating to a product, process, system, person or body are fulfilled". JCGM (2012) establishes a framework for accounting for measurement uncertainty in conformity assessment. The focus of JCGM (2012) is on the conformity assessment of individual units of product based on measurements on a cardinal continuous scale. However, the scheme can also be applied to composite assessment targets like finite lots of product or manufacturing processes, and to the evaluation of characteristics in discrete cardinal or nominal scales. We consider the application of the JCGM scheme in the conformity assessment of finite lots or processes of discrete units subject to a dichotomous quality classification as conforming and nonconforming. A lot or process is classified as conforming if the actual proportion nonconforming does not exceed a prescribed upper tolerance limit; otherwise the lot or process is classified as nonconforming. The measurement on the lot or process is a statistical estimation of the proportion nonconforming based on attributes or variables sampling, and measurement uncertainty is sampling uncertainty. Following JCGM (2012), we analyse the effect of measurement uncertainty (sampling uncertainty) in attributes sampling, and we calculate key conformity assessment parameters, in particular the producer's and consumer's risk. We suggest integrating such parameters as a useful add-on into ISO acceptance sampling standards such as the ISO 2859 series.
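To make the two risks concrete, here is a minimal sketch for a single attributes sampling plan under a binomial model, which suits a process rather than a finite lot. It uses the classical acceptance-sampling reading of the risks, conditional on an assumed true proportion nonconforming; the prior-based specific and global risks of JCGM 106 are not modelled. The function names and the plan parameters `n` and `c` are illustrative assumptions, not taken from the paper.

```python
from math import comb


def acceptance_probability(n, c, p):
    """P(accept) for a single sampling plan: accept if at most c of the
    n sampled units are nonconforming, given true proportion p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def conditional_risks(n, c, p_good, p_bad):
    """Classical risks at two assumed true proportions nonconforming.

    p_good: a proportion within tolerance; rejecting here is the producer's risk.
    p_bad:  a proportion beyond tolerance; accepting here is the consumer's risk.
    """
    producer_risk = 1.0 - acceptance_probability(n, c, p_good)
    consumer_risk = acceptance_probability(n, c, p_bad)
    return producer_risk, consumer_risk
```

For example, `conditional_risks(50, 1, 0.01, 0.10)` evaluates a plan that samples 50 units and accepts on at most one nonconforming unit, at hypothetical proportions of 1% and 10%.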
Citations: 0
Bayesian estimation of the number of significant principal components for cultural data
Pub Date : 2024-09-18 DOI: arxiv-2409.12129
Joshua C. Macdonald, Javier Blanco-Portillo, Marcus W. Feldman, Yoav Ram
Principal component analysis (PCA) is often used to analyze multivariate data together with cluster analysis, which depends on the number of principal components used. It is therefore important to determine the number of significant principal components (PCs) extracted from a data set. Here we use a variational Bayesian version of classical PCA to develop a new method for estimating the number of significant PCs in contexts where the number of samples is similar to or greater than the number of features. This eliminates guesswork and potential bias in manually determining the number of principal components and avoids overestimation of variance by filtering noise. This framework can be applied to datasets of different shapes (number of rows and columns), different data types (binary, ordinal, categorical, continuous), and with noisy and missing data. Therefore, it is especially useful for data with arbitrary encodings and similar numbers of rows and columns, such as cultural, ecological, morphological, and behavioral datasets. We tested our method on both synthetic data and empirical datasets and found that it may underestimate but not overestimate the number of principal components for the synthetic data. A small number of components was found for each empirical dataset. These results suggest that it is broadly applicable across the life sciences.
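For contrast, the manual approach the abstract wants to replace usually looks like the sketch below: classical PCA followed by a hand-picked cumulative-variance cutoff. This is not the variational Bayesian method of the paper; the function names and the `threshold` value are assumptions, and the cutoff is exactly the guesswork the authors aim to remove.

```python
import numpy as np


def explained_variance_ratio(X):
    """Share of total variance carried by each principal component of X
    (rows are samples, columns are features)."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, largest first
    var = s**2
    return var / var.sum()


def n_components_for(X, threshold=0.9):
    """Smallest number of leading PCs whose cumulative variance share
    reaches the (hand-picked) threshold."""
    cum = np.cumsum(explained_variance_ratio(X))
    k = int(np.searchsorted(cum, threshold) + 1)
    return min(k, len(cum))
```

Changing `threshold` changes the answer, which shows how sensitive the classical procedure is to an arbitrary choice.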
Citations: 0
Optimal Visual Search with Highly Heuristic Decision Rules
Pub Date : 2024-09-18 DOI: arxiv-2409.12124
Anqi Zhang, Wilson S. Geisler
Visual search is a fundamental natural task for humans and other animals. We investigated the decision processes humans use when searching briefly presented displays having well-separated potential target-object locations. Performance was compared with the Bayesian-optimal decision process under the assumption that the information from the different potential target locations is statistically independent. Surprisingly, humans performed slightly better than optimal, despite humans' substantial loss of sensitivity in the fovea, and the implausibility of the human brain replicating the optimal computations. We show that three factors can quantitatively explain these seemingly paradoxical results. Most importantly, simple and fixed heuristic decision rules reach near-optimal search performance. Secondly, foveal neglect primarily affects only the central potential target location. Finally, spatially correlated neural noise causes search performance to exceed that predicted for independent noise. These findings have far-reaching implications for understanding visual search tasks and other identification tasks in humans and other animals.
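The contrast between a Bayesian-optimal rule and a fixed heuristic can be sketched for a simplified localization task. The assumptions here are mine, not the authors': the target is present at exactly one location, each location yields one unit-variance Gaussian response that is independent across locations, and the heuristic is the plain max rule. The sensitivities `dprime` are hypothetical inputs.

```python
import numpy as np


def optimal_location(x, dprime, prior=None):
    """MAP target location for independent unit-variance Gaussian responses.

    Assumed model: x[i] ~ N(dprime[i], 1) if the target is at location i,
    otherwise x[i] ~ N(0, 1).
    """
    x, dprime = np.asarray(x, float), np.asarray(dprime, float)
    if prior is None:
        prior = np.full(x.shape, 1.0 / x.size)
    # Log posterior up to a constant: log prior plus the log likelihood ratio.
    log_post = np.log(prior) + dprime * x - 0.5 * dprime**2
    return int(np.argmax(log_post))


def max_rule_location(x):
    """Fixed heuristic: pick the location with the largest raw response."""
    return int(np.argmax(x))
```

When all `dprime` values are equal and the prior is uniform, the two rules always agree; they can differ only when sensitivity varies across locations, as it does between fovea and periphery.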
Citations: 0
Equity considerations in COVID-19 vaccine allocation modelling: a literature review
Pub Date : 2024-09-17 DOI: arxiv-2409.11462
Eva Rumpler, Marc Lipsitch
We conducted a literature review of COVID-19 vaccine allocation modelling papers, specifically looking for publications that considered equity. We found that most models did not take equity into account, with the vast majority of publications presenting aggregated results and no results by any subgroup (e.g., age, race, geography). We then give examples of how modelling can be useful to answer equity questions, and highlight some of the findings from the publications that did. Lastly, we describe seven considerations that seem important when including equity in future vaccine allocation models.
Citations: 0
Testing for racial bias using inconsistent perceptions of race
Pub Date : 2024-09-17 DOI: arxiv-2409.11269
Nora Gera, Emma Pierson
Tests for racial bias commonly assess whether two people of different races are treated differently. A fundamental challenge is that, because two people may differ in many ways, factors besides race might explain differences in treatment. Here, we propose a test for bias which circumvents the difficulty of comparing two people by instead assessing whether the $\textit{same person}$ is treated differently when their race is perceived differently. We apply our method to test for bias in police traffic stops, finding that the same driver is likelier to be searched or arrested by police when they are perceived as Hispanic than when they are perceived as white. Our test is broadly applicable to other datasets where race, gender, or other identity data are perceived rather than self-reported, and the same person is observed multiple times.
Citations: 0
Leveraging Connected Vehicle Data for Near-Crash Detection and Analysis in Urban Environments
Pub Date : 2024-09-17 DOI: arxiv-2409.11341
Xinyu Li, Dayong (Jason) Wu, Xinyue Ye, Quan Sun
Urban traffic safety is a pressing concern in modern transportation systems, especially in rapidly growing metropolitan areas where increased traffic congestion, complex road networks, and diverse driving behaviors exacerbate the risk of traffic incidents. Traditional traffic crash data analysis offers valuable insights but often overlooks a broader range of road safety risks. Near-crash events, which occur more frequently and signal potential collisions, provide a more comprehensive perspective on traffic safety. However, city-scale analysis of near-crash events remains limited due to the significant challenges in large-scale real-world data collection, processing, and analysis. This study utilizes one month of connected vehicle data, comprising billions of records, to detect and analyze near-crash events across the road network in the City of San Antonio, Texas. We propose an efficient framework integrating spatial-temporal buffering and heading algorithms to accurately identify and map near-crash events. A binary logistic regression model is employed to assess the influence of road geometry, traffic volume, and vehicle types on near-crash risks. Additionally, we examine spatial and temporal patterns, including variations by time of day, day of the week, and road category. The findings of this study show that the vehicles on more than half of road segments will be involved in at least one near-crash event. In addition, more than 50% of near-crash events involved vehicles traveling at speeds over 57.98 mph, and many occurred at short distances between vehicles. The analysis also found that wider roadbeds and multiple lanes reduced near-crash risks, while single-unit trucks slightly increased the likelihood of near-crash events. Finally, the spatial-temporal analysis revealed that near-crash risks were most prominent during weekday peak hours, especially in downtown areas.
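A spatial-temporal buffer with a heading check might be sketched as follows. The abstract does not specify the authors' thresholds or how heading is used, so everything here is an assumption: the `Ping` record layout, the one-second and five-metre buffers, and the rule that drops pairs travelling in roughly opposite directions (treated here as ordinary passing traffic). The quadratic scan is for illustration only and would not scale to billions of records.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Ping:
    vehicle_id: str
    t: float        # seconds since some epoch
    lat: float      # degrees
    lon: float      # degrees
    heading: float  # degrees clockwise from north


def haversine_m(a, b):
    """Great-circle distance in metres between two pings."""
    r = 6_371_000.0  # mean Earth radius in metres
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(h))


def heading_diff(h1, h2):
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def near_crash_candidates(pings, max_dt=1.0, max_dist_m=5.0, max_heading_diff=150.0):
    """Pairs of pings from different vehicles that fall inside the assumed
    time and distance buffers and are not heading in opposite directions."""
    pings = sorted(pings, key=lambda p: p.t)
    out = []
    for i, a in enumerate(pings):
        for b in pings[i + 1:]:
            if b.t - a.t > max_dt:
                break  # sorted by time, so later pings are even further away
            if (a.vehicle_id != b.vehicle_id
                    and haversine_m(a, b) <= max_dist_m
                    and heading_diff(a.heading, b.heading) <= max_heading_diff):
                out.append((a, b))
    return out
```

A production version would replace the nested loop with a spatial index or grid bucketing so that only nearby pings are compared.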
Citations: 0
Spatio-Temporal-Network Point Processes for Modeling Crime Events with Landmarks
Pub Date : 2024-09-17 DOI: arxiv-2409.10882
Zheng Dong, Jorge Mateu, Yao Xie
Self-exciting point processes are widely used to model the contagious effects of crime events living within continuous geographic space, using their occurrence time and locations. However, in urban environments, most events are naturally constrained within the city's street network structure, and the contagious effects of crime are governed by such a network geography. Meanwhile, the complex distribution of urban infrastructures also plays an important role in shaping crime patterns across space. We introduce a novel spatio-temporal-network point process framework for crime modeling that integrates these urban environmental characteristics by incorporating self-attention graph neural networks. Our framework incorporates the street network structure as the underlying event space, where crime events can occur at random locations on the network edges. To realistically capture criminal movement patterns, distances between events are measured using street network distances. We then propose a new mark for a crime event by concatenating the event's crime category with the type of its nearby landmark, aiming to capture how the urban design influences the mixing structures of various crime types. A graph attention network architecture is adopted to learn the existence of mark-to-mark interactions. Extensive experiments on crime data from Valencia, Spain, demonstrate the effectiveness of our framework in understanding the crime landscape and forecasting crime risks across regions.
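The idea of self-excitation measured along the street network can be sketched with a basic Hawkes-style conditional intensity. This is a simplification under stated assumptions, not the paper's model: the exponential kernels and the parameters `mu`, `alpha`, `beta`, and `sigma` are hypothetical, marks and the graph attention component are omitted, and `dist` is a caller-supplied function standing in for a real shortest-path computation.

```python
from math import exp


def intensity(t, loc, events, dist, mu=0.1, alpha=0.5, beta=1.0, sigma=200.0):
    """Conditional intensity at time t and network location loc.

    events: list of (t_i, loc_i) pairs for past events.
    dist:   function returning the street-network distance between two
            locations (assumed to be supplied by the caller).
    """
    total = mu  # background rate
    for t_i, loc_i in events:
        if t_i < t:  # only earlier events can excite the process
            temporal = beta * exp(-beta * (t - t_i))   # decay in elapsed time
            spatial = exp(-dist(loc, loc_i) / sigma)   # decay in network distance
            total += alpha * temporal * spatial
    return total
```

Swapping `dist` from Euclidean to network distance is the only change needed to move this toy intensity from continuous space onto a street graph.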
Citations: 0
A point process approach for the classification of noisy calcium imaging data
Pub Date : 2024-09-16 DOI: arxiv-2409.10409
Arianna Burzacchi, Nicoletta D'Angelo, David Payares-Garcia, Jorge Mateu
We study noisy calcium imaging data, with a focus on the classification of spike traces. As raw traces obscure the true temporal structure of neuron's activity, we performed a tuned filtering of the calcium concentration using two methods: a biophysical model and a kernel mapping. The former characterizes spike trains related to a particular triggering event, while the latter filters out the signal and refines the selection of the underlying neuronal response. Transitioning from traditional time series analysis to point process theory, the study explores spike-time distance metrics and point pattern prototypes to describe repeated observations. We assume that the analyzed neuron's firing events, i.e. spike occurrences, are temporal point process events. In particular, the study aims to categorize 47 point patterns by depth, assuming the similarity of spike occurrences within specific depth categories. The results highlight the pivotal roles of depth and stimuli in discerning diverse temporal structures of neuron firing events, confirming the point process approach based on prototype analysis is largely useful in the classification of spike traces.
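The abstract does not say which spike-time distance metrics are used. One well-known example is the Victor-Purpura distance, sketched below as an illustration only; the cost parameter `q` and the function name are assumptions.

```python
def victor_purpura(a, b, q=1.0):
    """Victor-Purpura distance between two sorted lists of spike times.

    Deleting or inserting a spike costs 1; shifting a spike by dt costs q * |dt|.
    """
    n, m = len(a), len(b)
    # d[i][j] = distance between the first i spikes of a and the first j spikes of b
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete all i spikes
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert all j spikes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,                               # delete a spike from a
                d[i][j - 1] + 1.0,                               # insert a spike from b
                d[i - 1][j - 1] + q * abs(a[i - 1] - b[j - 1]),  # shift a spike
            )
    return d[n][m]
```

With a pairwise distance like this, a prototype for a group of point patterns can be chosen as the pattern that minimizes the summed distance to the others, which is one possible way to classify a new trace by its nearest prototype.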
我们研究了噪声钙成像数据,重点是对尖峰迹线进行分类。由于原始踪迹模糊了神经元活动的真实时间结构,我们使用两种方法对钙浓度进行了调整过滤:生物物理模型和核映射。前者描述了与特定触发事件相关的尖峰序列,后者则滤除了信号并完善了对潜在神经元响应的选择。从传统的时间序列分析过渡到点过程理论,该研究探索了尖峰-时间距离度量和点模式原型,以描述重复观测。我们假设被分析神经元的发射事件(即尖峰发生)是时间点过程事件。具体而言,本研究旨在根据深度对 47 个点模式进行分类,并假设特定深度类别中的尖峰发生具有相似性。研究结果强调了深度和刺激在辨别神经元发射事件的不同时间结构中的关键作用,证实了基于原型分析的点过程方法在尖峰轨迹分类中非常有用。
Citations: 0
TCDformer-based Momentum Transfer Model for Long-term Sports Prediction
Pub Date : 2024-09-16 DOI: arxiv-2409.10176
Hui Liu, Jiacheng Gu, Xiyuan Huang, Junjie Shi, Tongtong Feng, Ning He
Accurate sports prediction is a crucial skill for professional coaches, which can assist in developing effective training strategies and scientific competition tactics. Traditional methods often use complex mathematical statistical techniques to boost predictability, but this often is limited by dataset scale and has difficulty handling long-term predictions with variable distributions, notably underperforming when predicting point-set-game multi-level matches. To deal with this challenge, this paper proposes TM2, a TCDformer-based Momentum Transfer Model for long-term sports prediction, which encompasses a momentum encoding module and a prediction module based on momentum transfer. TM2 initially encodes momentum in large-scale unstructured time series using the local linear scaling approximation (LLSA) module. Then it decomposes the reconstructed time series with momentum transfer into trend and seasonal components. The final prediction results are derived from the additive combination of a multilayer perceptron (MLP) for predicting trend components and wavelet attention mechanisms for seasonal components. Comprehensive experimental results show that on the 2023 Wimbledon men's tournament datasets, TM2 significantly surpasses existing sports prediction models in terms of performance, reducing MSE by 61.64% and MAE by 63.64%.
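The decompose-then-add structure described above can be sketched in a few lines. This is not TM2: as loudly labelled stand-ins, a centered moving average replaces the paper's decomposition, a linear extrapolation replaces the MLP trend predictor, and repeating the last cycle replaces the wavelet attention mechanism. The function names and the `window` and `period` parameters are hypothetical.

```python
def moving_average_trend(series, window):
    """Centered moving average; the ends are padded by repeating edge values."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=5):
    """Split a series into trend and seasonal parts with series = trend + seasonal."""
    trend = moving_average_trend(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal


def additive_forecast(series, horizon, window=5, period=4):
    """Forecast each component separately, then add the two forecasts.

    Stand-ins (assumptions, not the paper's models):
      - trend: linear extrapolation of the last two trend values (paper: MLP)
      - seasonal: repeat the last `period` values (paper: wavelet attention)
    """
    trend, seasonal = decompose(series, window)
    slope = trend[-1] - trend[-2]
    trend_forecast = [trend[-1] + slope * (h + 1) for h in range(horizon)]
    last_cycle = seasonal[-period:]
    seasonal_forecast = [last_cycle[h % period] for h in range(horizon)]
    return [t + s for t, s in zip(trend_forecast, seasonal_forecast)]
```

Calling `additive_forecast(series, horizon=3)` on a list of at least `period` numeric values returns three predicted values. Either component forecaster can be swapped for a learned model without changing the final additive step.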
Citations: 0