arXiv - STAT - Other Statistics: Latest Papers, Page 8

ceylon: An R package for plotting the maps of Sri Lanka

arXiv - STAT - Other Statistics

Pub Date : 2024-01-04 DOI: arxiv-2401.02467

Thiyanga S. Talagala

The rapid evolution in the fields of computer science, data science, and artificial intelligence has significantly transformed the utilisation of data for decision-making. Data visualisation plays a critical role in any work that involves data. Visualising data on maps is frequently encountered in many fields. Visualising data on maps not only transforms raw data into visually comprehensible representations but also converts complex spatial information into simple, understandable form. Locating the data files necessary for map creation can be a challenging task. Establishing a centralised repository can alleviate the challenging task of finding shape files, allowing users to efficiently discover geographic data. The ceylon R package is designed to make simple feature data related to Sri Lanka's administrative boundaries and rivers and streams accessible for a diverse range of R users. With straightforward functionalities, this package allows users to quickly plot and explore administrative boundaries and rivers and streams in Sri Lanka.

Citations: 0

Facilitating the Integration of Ethical Reasoning into Quantitative Courses: Stakeholder Analysis, Ethical Practice Standards, and Case Studies

arXiv - STAT - Other Statistics

Pub Date : 2024-01-03 DOI: arxiv-2401.01973

Rochelle E. Tractenberg, Suzanne Thorton

Case studies are typically used to teach 'ethics', but in quantitative courses it can seem distracting, for both instructor and learner, to introduce a case analysis. Moreover, case analyses are typically focused on issues relating to people: obtaining consent, dealing with research team members, and/or potential institutional policy violations. While relevant to some research, not all students in quantitative courses plan to become researchers, and ethical practice is an essential topic for students of mathematics, statistics, data science, and computing regardless of whether or not the learner intends to do research. Ethical reasoning is a way of thinking that requires the individual to assess what they know about a potential ethical problem (their prerequisite knowledge), and in some cases, how behaviors they observe, are directed to perform, or have performed, diverge from what they know to be ethical behavior. Ethical reasoning is a learnable, improvable set of knowledge, skills, and abilities that enable learners to recognize what they do and do not know about what constitutes 'ethical practice' of a discipline, and in some cases, to contemplate alternative decisions about how to first recognize, and then proceed past, or respond to, such divergences. A stakeholder analysis is part of prerequisite knowledge, and can be used whether there is or is not an actual case or situation to react to. In courses with mainly quantitative content, a stakeholder analysis is a useful tool for instruction and assessment. It can be used to both integrate authentic ethical content and encourage careful quantitative thought. It is a mistake to treat 'training in ethical practice' and 'training in responsible conduct of research' as the same thing. This paper discusses how to introduce ethical reasoning, stakeholder analysis, and ethical practice standards authentically in quantitative courses.

Citations: 0

Revisiting the effect of greediness on the efficacy of exchange algorithms for generating exact optimal experimental designs

arXiv - STAT - Other Statistics

Pub Date : 2023-12-19 DOI: arxiv-2312.12645

William T. Gullion, Stephen J. Walsh

Coordinate exchange (CEXCH) is a popular algorithm for generating exact optimal experimental designs. The authors of CEXCH advocated for a highly greedy implementation - one that exchanges and optimizes single element coordinates of the design matrix. We revisit the effect of greediness on CEXCH's efficacy for generating highly efficient designs. We implement the single-element CEXCH (most greedy), a design-row (medium greedy) optimization exchange, and particle swarm optimization (PSO; least greedy) on 21 exact response surface design scenarios, under the $D$- and $I$-criteria, which have well-known optimal designs that have been reproduced by several researchers. We found essentially no difference in performance of the most greedy CEXCH and the medium greedy CEXCH. PSO did exhibit better efficacy for generating $D$-optimal designs, and for most $I$-optimal designs than CEXCH, but not to a strong degree under our parametrization. This work suggests that further investigation of the greediness dimension and its effect on CEXCH efficacy on a wider suite of models and criteria is warranted.
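
To make the single-element (most greedy) exchange concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation. The first-order model, the three candidate levels per coordinate, and every function name (`model_row`, `d_criterion`, `coordinate_exchange`) are illustrative choices, and the criterion is the plain determinant of X'X rather than a scaled D-efficiency.

```python
import random

LEVELS = (-1.0, 0.0, 1.0)  # assumed candidate levels for every coordinate


def model_row(point):
    """Expand a design point into a first-order model row: intercept plus main effects."""
    return [1.0] + list(point)


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def d_criterion(design):
    """det(X'X) for the model matrix X built from the design."""
    x = [model_row(point) for point in design]
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    return det(xtx)


def coordinate_exchange(design, levels=LEVELS, max_passes=50):
    """Greedy single-coordinate exchange: keep a new level only if it raises det(X'X)."""
    design = [list(point) for point in design]  # work on a copy
    best = d_criterion(design)
    for _ in range(max_passes):
        improved = False
        for i in range(len(design)):
            for j in range(len(design[i])):
                best_level = design[i][j]
                for level in levels:
                    design[i][j] = level
                    value = d_criterion(design)
                    if value > best + 1e-9:
                        best, best_level, improved = value, level, True
                design[i][j] = best_level
        if not improved:
            break  # a full pass with no improvement: local optimum reached
    return design, best


rng = random.Random(0)
start = [[rng.choice(LEVELS) for _ in range(2)] for _ in range(4)]
final, value = coordinate_exchange(start)
print(d_criterion(start), "->", value)
```

A single run only reaches a local optimum, so in practice the exchange is usually repeated from many random starts and the best design kept. A row-wise (medium greedy) variant would optimize all coordinates of one row jointly before moving on.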

Citations: 0

Applying Pre-Trained Deep-Learning Model on Wrist Angel Data -- An Analysis Plan

arXiv - STAT - Other Statistics

Pub Date : 2023-12-14 DOI: arxiv-2312.09052

Harald Vilhelm Skat-Rørdam, Mia Hang Knudsen, Simon Nørby Knudsen, Nicole Nadine Lønfeldt, Sneha Das, Line Katrine Harder Clemmensen

We aim to investigate if we can improve predictions of stress caused by OCD symptoms using pre-trained models, and present our statistical analysis plan in this paper. With the methods presented in this plan, we aim to avoid bias from data knowledge and thereby strengthen our hypotheses and findings. The Wrist Angel study, which this statistical analysis plan concerns, contains data from nine participants, between 8 and 17 years old, diagnosed with obsessive-compulsive disorder (OCD). The data was obtained by an Empatica E4 wristband, which the participants wore during waking hours for 8 weeks. The purpose of the study is to assess the feasibility of predicting the in-the-wild OCD events captured during this period. In our analysis, we aim to investigate if we can improve predictions of stress caused by OCD symptoms, and to do this we have created a pre-trained model, trained on four open-source datasets for stress prediction. We intend to apply this pre-trained model to the Wrist Angel data by fine-tuning, thereby utilizing transfer learning. The pre-trained model is a convolutional neural network that uses blood volume pulse, heart rate, electrodermal activity, and skin temperature as time series windows to predict OCD events. Furthermore, using accelerometer data, another model filters physical activity to further improve performance, given that physical activity is physiologically similar to stress. By evaluating various ways of applying our model (fine-tuned, non-fine-tuned, pre-trained, non-pre-trained, and with or without activity classification), we contextualize the problem such that it can be assessed if transfer learning is a viable strategy in this domain.
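
The abstract describes feeding the four physiological signals to the network as time series windows. The sketch below shows only that windowing step, under assumptions. The window length, step size, synthetic values and the name `sliding_windows` are hypothetical and do not come from the analysis plan.

```python
def sliding_windows(samples, window, step):
    """Cut a sequence of multichannel samples into fixed-length, possibly overlapping windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(samples) - window
    # range() is empty when the recording is shorter than one window
    return [samples[s:s + window] for s in range(0, last_start + 1, step)]


# Hypothetical recording: each sample is (bvp, heart_rate, eda, skin_temp).
recording = [(0.1 * t, 70.0 + t, 0.5, 33.0) for t in range(10)]
windows = sliding_windows(recording, window=4, step=2)
print(len(windows), len(windows[0]))
```

Each window would then be labelled (event or no event) and passed to the classifier. Choosing the window length and overlap is itself a modelling decision that the sketch leaves open.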

Citations: 0

Introduction to probability and statistics: a computational framework of randomness

arXiv - STAT - Other Statistics

Pub Date : 2023-12-06 DOI: arxiv-2401.08622

Lakshman Mahto

This text presents a unified approach to probability and statistics in the pursuit of understanding and computing randomness in engineering, physical, or social systems, with prediction and generalizability. Starting from elementary probability and the theory of distributions, the material progresses towards concepts and advances in prediction and generalization in statistical models and large sample theory. We also pay special attention to a unified derivation approach and a one-shot proof of each and every probabilistic concept. Our presentation of the intuitive and computational framework of conditional distribution and probability is strongly influenced by unified patterns of linear models for regression and for classification. The text ends with a future note on the unified approximation of the linear models, the generalized linear models and the discovery models to neural networks and a summarized ML system.

Citations: 0

A Conversation with A. Philip Dawid

arXiv - STAT - Other Statistics

Pub Date : 2023-12-01 DOI: arxiv-2312.00632

Vladimir Vovk, Glenn Shafer

Beginning in the 1970s, Alexander Philip Dawid has been a leading contributor to the foundations of statistics and especially to the development and application of Bayesian statistics. He is also known for his work on causality, especially his notation for conditional independence and his critique of the overuse of counterfactuals, and for his contributions to forensic statistics. Dawid was born in Lancashire, England, on February 1, 1946. His family moved to London soon afterwards, and he attended the City of London School from 1956 to 1963. He studied mathematics at Cambridge, earning a BA (Bachelor of Arts) degree in 1966. After earning a Diploma in Mathematical Statistics in the academic year 1966-1967, he studied for a PhD at Imperial, then at UCL, where he became a Lecturer in Statistics in 1969. In 1978, he left UCL for a position as Professor of Statistics in the Department of Mathematics, The City University, London, where he served as Head of Statistics Section and Director of the Statistical Laboratory. He returned to the Department of Statistics at UCL in 1981, serving as Head of Department from 1983 to 1993. He moved to the University of Cambridge in 2007, becoming Professor of Statistics and Fellow of Darwin College. He has continued his work in mathematical statistics after retiring from Cambridge in 2013 and was elected Fellow of the Royal Society in 2018.

Citations: 0

edibble: An R package to encapsulate elements of experimental designs for better planning, management and workflow

arXiv - STAT - Other Statistics

Pub Date : 2023-11-16 DOI: arxiv-2311.09705

Emi Tanaka

I present an R package called edibble that facilitates the design of experiments by encapsulating elements of the experiment in a series of composable functions. This package is an interpretation of "the grammar of experimental designs" by Tanaka (2023) in the R programming language. The main features of the edibble package are demonstrated, illustrating how it can be used to create a wide array of experimental designs. The implemented system aims to encourage cognitive thinking for holistic planning and data management of experiments in a streamlined workflow. This workflow can increase the inherent value of experimental data by reducing potential errors or noise with careful preplanning, as well as ensuring fit-for-purpose analysis of experimental data.

Citations: 0

Directional Gaussian spatial processes for South African wind data

arXiv - STAT - Other Statistics

Pub Date : 2023-11-10 DOI: arxiv-2311.05954

Jacobus S. Blom, Priyanka Nagar, Andriette Bekker

Accurate wind pattern modelling is crucial for various applications, including renewable energy, agriculture, and climate adaptation. In this paper, we introduce the wrapped Gaussian spatial process (WGSP), as well as the projected Gaussian spatial process (PGSP) custom-tailored for South Africa's intricate wind behaviour. Unlike conventional models struggling with the circular nature of wind direction, the WGSP and PGSP adeptly incorporate circular statistics to address this challenge. Leveraging historical data sourced from meteorological stations throughout South Africa, the WGSP and PGSP significantly increase predictive accuracy while capturing the nuanced spatial dependencies inherent to wind patterns. The superiority of the PGSP model in capturing the structural characteristics of the South African wind data is evident. As opposed to the PGSP, the WGSP model is computationally less demanding, allows for the use of less informative priors, and its parameters are more easily interpretable. The implications of this study are far-reaching, offering potential benefits ranging from the optimisation of renewable energy systems to the informed decision-making in agriculture and climate adaptation strategies. The WGSP and PGSP emerge as robust and invaluable tools, facilitating precise modelling of wind patterns within the dynamic context of South Africa.
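
A small Python sketch may help show why wind direction needs circular statistics, and what "wrapped" and "projected" mean for a single angle. It is an illustration under assumptions, not the spatial processes of the paper. There is no spatial dependence here, the projected sampler assumes an identity covariance, and all function names are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def wrap(theta):
    """Wrap an angle in radians onto [0, 2*pi)."""
    wrapped = theta % TWO_PI
    return 0.0 if wrapped >= TWO_PI else wrapped  # guard against rounding up to 2*pi


def circular_mean(angles):
    """Direction of the mean resultant vector, in radians."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return wrap(math.atan2(s, c))


def circular_distance(a, b):
    """Shortest angular distance between two directions."""
    d = abs(wrap(a) - wrap(b))
    return min(d, TWO_PI - d)


def sample_wrapped_normal(mu, sigma, rng):
    """Wrapped approach: draw on the real line, then wrap onto the circle."""
    return wrap(rng.gauss(mu, sigma))


def sample_projected_normal(mean_x, mean_y, rng):
    """Projected approach: draw a bivariate normal (identity covariance assumed), keep its angle."""
    x = rng.gauss(mean_x, 1.0)
    y = rng.gauss(mean_y, 1.0)
    return wrap(math.atan2(y, x))


# Two directions just either side of north: 350 and 10 degrees.
north_ish = [math.radians(350.0), math.radians(10.0)]
naive = sum(north_ish) / len(north_ish)  # arithmetic mean: 180 degrees, i.e. due south
circular = circular_mean(north_ish)      # circular mean: north, up to rounding
print(math.degrees(naive), circular_distance(circular, 0.0))
```

The arithmetic mean fails because it ignores that 350 and 10 degrees are neighbours on the circle. Both the wrapped and the projected constructions avoid this by defining the distribution on the circle itself.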

Citations: 0

Gaussian Differential Privacy on Riemannian Manifolds

arXiv - STAT - Other Statistics

Pub Date : 2023-11-09 DOI: arxiv-2311.10101

Yangdi Jiang, Xiaotian Chang, Yi Liu, Lei Ding, Linglong Kong, Bei Jiang

We develop an advanced approach for extending Gaussian Differential Privacy (GDP) to general Riemannian manifolds. The concept of GDP stands out as a prominent privacy definition that strongly warrants extension to manifold settings, due to its central limit properties. By harnessing the power of the renowned Bishop-Gromov theorem in geometric analysis, we propose a Riemannian Gaussian distribution that integrates the Riemannian distance, allowing us to achieve GDP in Riemannian manifolds with bounded Ricci curvature. To the best of our knowledge, this work marks the first instance of extending the GDP framework to accommodate general Riemannian manifolds, encompassing curved spaces, and circumventing the reliance on tangent space summaries. We provide a simple algorithm to evaluate the privacy budget $\mu$ on any one-dimensional manifold and introduce a versatile Markov Chain Monte Carlo (MCMC)-based algorithm to calculate $\mu$ on any Riemannian manifold with constant curvature. Through simulations on one of the most prevalent manifolds in statistics, the unit sphere $S^d$, we demonstrate the superior utility of our Riemannian Gaussian mechanism in comparison to the previously proposed Riemannian Laplace mechanism for implementing GDP.
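
For orientation, the flat (Euclidean) case of the privacy budget $\mu$ can be sketched in a few lines of Python. This is the standard Gaussian mechanism on the real line, not the Riemannian mechanism of the paper. In this case, adding $N(0, \sigma^2)$ noise to a statistic with sensitivity $\Delta$ gives $\mu = \Delta / \sigma$, and the trade-off function is $G_\mu(\alpha) = \Phi(\Phi^{-1}(1 - \alpha) - \mu)$. The function names are illustrative.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def gdp_tradeoff(alpha, mu):
    """G_mu(alpha): smallest type II error at type I error alpha (0 < alpha < 1) under mu-GDP."""
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - mu)


def noise_scale(sensitivity, mu):
    """Noise standard deviation that yields mu-GDP in the Euclidean case: sigma = Delta / mu."""
    return sensitivity / mu


def gaussian_mechanism(value, sensitivity, mu, rng):
    """Release value plus Gaussian noise calibrated to the requested budget mu."""
    return value + rng.gauss(0.0, noise_scale(sensitivity, mu))


rng = random.Random(0)
released = gaussian_mechanism(value=3.2, sensitivity=1.0, mu=0.5, rng=rng)
print(released, gdp_tradeoff(0.05, 0.5))
```

A smaller $\mu$ means stronger privacy: the trade-off curve moves towards $1 - \alpha$, the curve of an adversary who can do no better than guessing. On a manifold there is no closed form of this kind in general, which is why the abstract turns to numerical and MCMC-based evaluation of $\mu$.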

Citations: 0

Broken Adaptive Ridge Method for Variable Selection in Generalized Partly Linear Models with Application to the Coronary Artery Disease Data

arXiv - STAT - Other Statistics

Pub Date : 2023-11-01 DOI: arxiv-2311.00210

Christian Chan, Xiaotian Dai, Thierry Chekouo, Quan Long, Xuewen Lu

Motivated by the CATHGEN data, we develop a new statistical learning method for simultaneous variable selection and parameter estimation under the context of generalized partly linear models for data with high-dimensional covariates. The method is referred to as the broken adaptive ridge (BAR) estimator, which is an approximation of the $L_0$-penalized regression by iteratively performing reweighted squared $L_2$-penalized regression. The generalized partly linear model extends the generalized linear model by including a non-parametric component to construct a flexible model for modeling various types of covariate effects. We employ the Bernstein polynomials as the sieve space to approximate the non-parametric functions so that our method can be implemented easily using the existing R packages. Extensive simulation studies suggest that the proposed method performs better than other commonly used penalty-based variable selection methods. We apply the method to the CATHGEN data with a binary response from a coronary artery disease study, which motivated our research, and obtained new findings in both high-dimensional genetic and low-dimensional non-genetic covariates.
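
The iteratively reweighted ridge idea behind BAR can be sketched for the simplest setting, an ordinary linear model with squared-error loss. This is an assumption-laden illustration, not the estimator of the paper, which targets generalized partly linear models with a Bernstein polynomial sieve. Each step solves a ridge problem whose penalty on coefficient j is $\lambda \beta_j^2 / (\beta_j^{(k)})^2$, written here in an algebraically equivalent form that stays stable as coefficients approach zero. The names `solve` and `broken_adaptive_ridge` and the tuning values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def broken_adaptive_ridge(x, y, lam=1.0, xi=1.0, max_iter=200, tol=1e-12):
    """Linear-model BAR sketch: ridge start, then reweighted ridge steps until convergence."""
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    # Initial estimate: ordinary ridge regression with penalty xi.
    ridge = [[xtx[i][j] + (xi if i == j else 0.0) for j in range(p)] for i in range(p)]
    beta = solve(ridge, xty)
    for _ in range(max_iter):
        # beta_new = G (G X'X G + lam I)^{-1} G X'y with G = diag(beta); this equals
        # (X'X + lam diag(1 / beta_j^2))^{-1} X'y but avoids dividing by tiny coefficients.
        a = [[beta[i] * xtx[i][j] * beta[j] + (lam if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        rhs = [beta[i] * xty[i] for i in range(p)]
        u = solve(a, rhs)
        new_beta = [beta[i] * u[i] for i in range(p)]
        converged = max(abs(nb - b) for nb, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Toy data: y = 2 * x1 + 0.1 * x2 on a two-level factorial layout (no intercept).
x = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-2.1, -1.9, 1.9, 2.1]
beta = broken_adaptive_ridge(x, y, lam=1.0)
print(beta)  # first coefficient stays large, second is driven to (numerically) zero
```

The weak coefficient is pushed to zero at an accelerating rate while the strong one is only mildly shrunk, which is the sense in which the iteration approximates an $L_0$ penalty. For a GLM, each reweighted step would be wrapped around the usual iteratively reweighted least squares update.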

Citations: 0