
Stat: Latest Articles

Methods for building a staff workforce of quantitative scientists in academic health care
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-06 DOI: 10.1002/sta4.683
Sarah Peskoe, Emily Slade, Lacey Rende, Mary Boulos, Manisha Desai, Mihir Gandhi, Jonathan A. L. Gelfond, Shokoufeh Khalatbari, Phillip J. Schulte, Denise C. Snyder, Sandra L. Taylor, Jesse D. Troy, Roger Vaughan, Gina‐Maria Pomann
Collaborative quantitative scientists, including biostatisticians, epidemiologists, bioinformaticists, and data‐related professionals, play vital roles in research, from study design to data analysis and dissemination. It is imperative that academic health care centers (AHCs) establish an environment that provides opportunities for the quantitative scientists who are hired as staff to develop and advance their careers. With the rapid growth of clinical and translational research, AHCs are charged with establishing organizational methods, training tools, best practices, and guidelines to accelerate and support hiring, training, and retaining this staff workforce. This paper describes three essential elements for building and maintaining a successful unit of collaborative staff quantitative scientists in academic health care centers: (1) organizational infrastructure and management, (2) recruitment, and (3) career development and retention. Specific strategies are provided as examples of how AHCs can excel in these areas.
Citations: 0
Considerations in developing a financial model for an academic statistical consulting centre
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-02 DOI: 10.1002/sta4.688
Christy Brown, Yanming Di, Stacey Slone
In operating an academic statistical consulting centre, it is essential to develop a strategy for covering the anticipated costs incurred, such as personnel, facilities, third‐party data, professional development and marketing, and for handling the revenues generated from sources such as university commitments, extramural grants, fees for service, internal memorandums of understanding and consulting courses. As such, this article describes each of these costs and revenue sources in turn, discusses how they vary over phases of a project and life cycles of a centre, provides a review of both historical and modern perspectives in the literature and includes illustrative examples of financial models from three different institutions. These points of consideration are meant to inform consulting groups who are interested in becoming either more or less centrally structured.
Citations: 0
Maximum a posteriori estimation in graphical models using local linear approximation
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-01 DOI: 10.1002/sta4.682
Ksheera Sagar, Jyotishka Datta, Sayantan Banerjee, Anindya Bhadra
Sparse structure learning in high‐dimensional Gaussian graphical models is an important problem in multivariate statistical inference, since the sparsity pattern naturally encodes the conditional independence relationship among variables. However, maximum a posteriori (MAP) estimation is challenging under hierarchical prior models, and traditional numerical optimization routines or expectation–maximization algorithms are difficult to implement. To this end, our contribution is a novel local linear approximation scheme that circumvents this issue using a very simple computational algorithm. Most importantly, the condition under which our algorithm is guaranteed to converge to the MAP estimate is explicitly stated and is shown to cover a broad class of completely monotone priors, including the graphical horseshoe. Further, the resulting MAP estimate is shown to be sparse and consistent in the ‐norm. Numerical results validate the speed, scalability and statistical performance of the proposed method.
Citations: 0
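Illustrative sketch for the local linear approximation (LLA) idea in the abstract above. This is not the authors' algorithm: it assumes a scalar normal-means objective, 0.5*(y - theta)^2 + lam*log(a + |theta|), where the log penalty is a hypothetical stand-in for the concave penalty a heavy-tailed prior would induce. Each LLA step replaces the penalty by its tangent line at the current iterate, which leaves a weighted lasso problem solved by soft-thresholding. The names `lla_scalar`, `lam` and `a` are assumptions made for the example.

```python
import math


def soft_threshold(y, w):
    """Minimiser of 0.5*(y - theta)**2 + w*|theta| over theta."""
    return math.copysign(max(abs(y) - w, 0.0), y)


def lla_scalar(y, lam=1.0, a=1.0, n_iter=100, tol=1e-12):
    """LLA iterations for 0.5*(y - theta)**2 + lam*log(a + |theta|).

    The log penalty is a hypothetical stand-in, not the paper's prior.
    """
    theta = y  # start from the unpenalised estimate
    for _ in range(n_iter):
        # Slope of the penalty at the current iterate: the LLA weight.
        weight = lam / (a + abs(theta))
        new_theta = soft_threshold(y, weight)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


large = lla_scalar(3.0)   # shrunk slightly towards zero
small = lla_scalar(0.5)   # set exactly to zero
```

In this toy setting a large observation is only mildly shrunk while a small one is thresholded to exactly zero, which is the kind of sparsity the abstract attributes to the MAP estimate.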
Generalised minimum moment aberration for designs with both qualitative and quantitative factors
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-05-01 DOI: 10.1002/sta4.684
Yao Xiao, Na Zou, Hong Qin, Kang Wang
The minimum moment aberration and the minimum Lee‐moment aberration criteria are two popular conceptually simple and computationally cheap criteria for selecting good designs. However, the minimum moment aberration is suitable for qualitative factors, and the minimum Lee‐moment aberration cannot distinguish some designs with high‐level quantitative factors. In this paper, the minimum absolute‐moment aberration criterion is proposed to compare and select designs with multi‐level quantitative factors. We validate the statistical justifications of this criterion from theoretical and numerical aspects. Furthermore, we extend the minimum absolute‐moment aberration criterion into screening designs with both qualitative and quantitative factors, naming the new criterion as the minimum mixed‐moment aberration criterion. Then we utilise a numerical study to compare and evaluate the performance of some popular designs with both qualitative and quantitative factors in computer experiments.
Citations: 0
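Illustrative sketch for the minimum moment aberration criterion named in the abstract above. It assumes the usual formulation for qualitative factors, in which the t-th power moment of a design is the average, over all pairs of runs, of the t-th power of the number of columns in which the two runs coincide, and designs are ranked by sequentially minimising the first, second, third, ... moments. The absolute-moment and mixed-moment criteria proposed in the paper are not reproduced here; the function names are hypothetical.

```python
from itertools import combinations


def coincidences(row_a, row_b):
    """Number of columns in which two runs take the same level."""
    return sum(1 for x, y in zip(row_a, row_b) if x == y)


def power_moment(design, t):
    """Average of coincidences**t over all unordered pairs of runs."""
    pairs = list(combinations(design, 2))
    return sum(coincidences(a, b) ** t for a, b in pairs) / len(pairs)


def moment_profile(design, max_t=3):
    """Power moments of order 1..max_t; smaller profiles (compared
    lexicographically) mean less moment aberration."""
    return [power_moment(design, t) for t in range(1, max_t + 1)]


full_factorial = [[0, 0], [0, 1], [1, 0], [1, 1]]
replicated = [[0, 0], [0, 0], [1, 1], [1, 1]]

# Python compares lists lexicographically, matching the sequential rule.
full_is_better = moment_profile(full_factorial) < moment_profile(replicated)
```

Both toy designs share the same first moment, so the comparison is decided by the second moment, where the replicated design is penalised for its identical runs.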
A sparse empirical Bayes approach to high‐dimensional Gaussian process‐based varying coefficient models
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-04-20 DOI: 10.1002/sta4.678
Myungjin Kim, Gyuhyeong Goh
Despite the increasing importance of high‐dimensional varying coefficient models, the study of their Bayesian versions is still in its infancy. This paper contributes to the literature by developing a sparse empirical Bayes formulation that addresses the problem of high‐dimensional model selection in the framework of Bayesian varying coefficient modelling under Gaussian process (GP) priors. To break the computational bottleneck of GP‐based varying coefficient modelling, we introduce the low‐cost computation strategy that incorporates linear algebra techniques and the Laplace approximation into the evaluation of the high‐dimensional posterior model distribution. A simulation study is conducted to demonstrate the superiority of the proposed Bayesian method compared to an existing high‐dimensional varying coefficient modelling approach. In addition, its applicability to real data analysis is illustrated using yeast cell cycle data.
Citations: 0
The value of flexible funding for collaborative biostatistics units in universities and academic medical centres
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-04-20 DOI: 10.1002/sta4.679
Emily Slade, Sarah Jane K. Robbins, Kristen J. McQuerry, Anthony A. Mangino
Collaborative biostatistics units within universities and academic medical centres operate under a wide range of different funding models; common to many of these models is the challenge of allocating time to activities that are not linked to a specific research project, such as professional development, mentorship and administrative tasks. The purpose of this paper is to describe a proposed model for ‘flexible funding’, that is, funding that is not linked to a specific research project, within a collaborative biostatistics unit and to detail the benefits and challenges associated with the proposed model. We present results from a qualitative study representing the perspectives of collaborative biostatisticians working under the proposed flexible funding model. In addition to providing examples of activities undertaken as part of time allocated to flexible funding, the qualitative results reveal several benefits of flexible funding both for a collaborative biostatistician (e.g., job satisfaction and professional development) and for the collaborative biostatistics unit as a whole (e.g., retention, process improvement, and leadership).
Citations: 0
Using extreme value theory to evaluate the leading pedestrian interval road safety intervention
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-04-18 DOI: 10.1002/sta4.676
Nicola Hewett, Lee Fawcett, Andrew Golightly, Neil Thorpe
Improving road safety is hugely important with the number of deaths on the world's roads remaining unacceptably high; an estimated 1.3 million people die each year as a result of road traffic collisions. Current practice for treating collision hotspots is almost always reactive: once a threshold level of collisions has been overtopped during some pre‐determined observation period, treatment is applied (e.g., road safety cameras). Traffic collisions are rare, so prolonged observation periods are necessary. However, traffic conflicts are more frequent and are a margin of the social cost; hence, traffic conflict before/after studies can be conducted over shorter time periods. We investigate the effect of implementing the leading pedestrian interval treatment at signalised intersections as a safety intervention in a city in North America. Pedestrian‐vehicle traffic conflict data were collected from treatment and control sites during the before and after periods. We implement a before/after study on post‐encroachment times (PETs) where small PET values denote ‘near‐misses’. Hence, extreme value theory is employed to model extremes of our PET processes, with adjustments to the usual modelling framework to account for temporal dependence and treatment effects.
Citations: 0
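Illustrative sketch for the extreme value step in the abstract above. Because small PETs are the extremes of interest, one common device is to work with how far each PET falls below a low threshold and fit a generalised Pareto distribution (GPD) to those excesses. The sketch below uses simple method-of-moments estimates, which is an assumption made for brevity: the paper's own fitting procedure, and its adjustments for temporal dependence and treatment effects, are not reproduced. The function name, the threshold and the data values are hypothetical.

```python
from statistics import mean, variance


def gpd_moment_fit(pets, threshold):
    """Method-of-moments GPD fit to the amounts by which PETs fall
    below `threshold`. Returns (shape, scale, number of excesses).

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2*shape)),
    which requires shape < 0.5 and non-constant excesses.
    """
    excesses = [threshold - p for p in pets if p < threshold]
    if len(excesses) < 2:
        raise ValueError("need at least two PETs below the threshold")
    m, v = mean(excesses), variance(excesses)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale, len(excesses)


# Made-up PETs in seconds, for illustration only.
pets_example = [0.4, 1.1, 1.6, 1.9, 2.5, 3.2, 4.0, 0.9]
shape, scale, n_excess = gpd_moment_fit(pets_example, threshold=2.0)
```

In a before/after setting, one simple way to use such a fit would be to estimate it separately on each period and compare the implied probabilities of very small PETs.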
The data science discovery program: A model for data science consulting in higher education
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-04-18 DOI: 10.1002/sta4.677
C. Taylor Brown, Megan Mehta, Mahathi Ryali, Xiaoran Dong, Iliya Shadfar, Jacqueline Dominquez Davalos, Aaron Culich, Anthony Suen
As one of the largest data science research incubator initiatives in the country, the University of California, Berkeley's Data Science Discovery Program serves as a case study for a scalable and sustainable model of data science consulting in higher education. This case contributes to the broader literature on data science consulting in higher education by analysing the programme's development, institutional influences; staffing and structural model; and defining features, which may prove instructive to similar programmes at other institutions. The programme is characterised by a unique structure of undergraduate consultations led by graduate student mentorship and governance; a streamlined, multidepartmental model that facilitates scalability and sustainability; and diverse modes for undergraduate consulting—including one‐on‐one ad‐hoc data science consultations, extended data science project development and management, peer mentorship and data science workshop instruction. This case demonstrates that universities may be able to initiate a low‐stakes, small‐scale data science consulting initiative and then progressively scale up the project in collaboration with multiple departments and organisations across campus.
Citations: 0
Utilizing latent connectivity among mediators in high-dimensional mediation analysis
IF 1.7 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-04-16 DOI: 10.1002/sta4.675
Jia Yuan Hu, Marley DeSimone, Qing Wang
Mediation analysis intends to unveil the underlying relationship between an outcome variable and an exposure variable through one or more intermediate variables called mediators. In recent decades, research on mediation analysis has been focusing on multivariate mediation models, where the number of mediating variables is possibly of high dimension. This paper concerns high-dimensional mediation analysis and proposes a three-step algorithm that extracts and utilizes inter-connectivity among candidate mediators. More specifically, the proposed methodology starts with a screening procedure to reduce the dimensionality of the initial set of candidate mediators, followed by a penalized regression model that incorporates both parameter- and group-wise regularization, and ends with fitting a multivariate mediation model and identifying active mediating variables through a joint significance test. To showcase the performance of the proposed algorithm, we conducted two simulation studies in high-dimensional and ultra-high-dimensional settings, respectively. Furthermore, we demonstrate the practical applications of the proposal using a real data set that uncovers the possible impact of environmental toxicants on women's gestational age at delivery through 61 biomarkers that belong to 7 biological pathways.
Citations: 0
High‐dimensional feature screening for nonlinear associations with survival outcome using restricted mean survival time
IF 1.7 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date : 2024-04-07 DOI: 10.1002/sta4.673
Yaxian Chen, Kwok Fai Lam, Zhonghua Liu
Summary: Feature screening is an important tool for analysing ultrahigh‐dimensional data, particularly in omics and oncology studies. However, most attention has focused on identifying features that have a linear or monotonic effect on the response variable. Detecting a sparse set of variables that have a nonlinear or nonmonotonic relationship with the response remains a challenging task. To fill this gap, the paper proposes a robust model‐free screening approach for right‐censored survival data that quantifies the covariate effect on the restricted mean survival time rather than on the routinely used hazard function. The proposed measure, based on the difference between the restricted mean survival time of the covariate‐stratified data and that of the overall data, can identify a wide range of associations, including linear, nonlinear, nonmonotone and even local dependencies such as change points. The sure screening property is established, and a more flexible iterative screening procedure is developed to increase the accuracy of variable screening. Simulation studies demonstrate the superiority of the proposed method in selecting important features that have a complex association with the response variable. The potential of applying the method to interval‐censored failure time data is also explored in simulations, with promising results. The method is applied to a breast cancer dataset to identify potential prognostic factors, which reveals potential associations between breast cancer and lymphoma.
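The idea of comparing stratum-level and overall restricted mean survival time (RMST) can be illustrated with a short sketch. This is not the authors' estimator: the function names `km_rmst` and `rmst_screen_stat`, the quantile-based stratification, the number of strata and the weighted absolute-gap statistic are all assumptions made for illustration. The RMST is taken as the area under the Kaplan-Meier curve up to a truncation time `tau`.

```python
import numpy as np

def km_rmst(time, event, tau):
    """Area under the Kaplan-Meier curve on [0, tau] (restricted mean survival time)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Distinct observed event times up to tau, in increasing order.
    event_times = np.unique(time[event & (time <= tau)])
    surv, last_t, area = 1.0, 0.0, 0.0
    for t in event_times:
        area += surv * (t - last_t)        # rectangle under the current step
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk     # product-limit update
        last_t = t
    # The curve is held flat from the last event time up to tau.
    return area + surv * (tau - last_t)

def rmst_screen_stat(x, time, event, tau, n_strata=4):
    """Illustrative statistic: weighted mean |stratum RMST - overall RMST|."""
    x = np.asarray(x, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    overall = km_rmst(time, event, tau)
    # Interior quantiles split the covariate into roughly equal-sized strata.
    cuts = np.quantile(x, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.searchsorted(cuts, x, side="right")
    stat = 0.0
    for s in range(n_strata):
        mask = strata == s
        if mask.any():
            gap = abs(km_rmst(time[mask], event[mask], tau) - overall)
            stat += mask.mean() * gap      # weight by stratum proportion
    return stat

# Toy data (hypothetical): survival time is U-shaped in x, so a linear
# correlation between x and time would be close to zero.
x = np.linspace(-1, 1, 200)
time = 1 + 4 * x**2
event = np.ones(200, dtype=bool)           # no censoring in this toy example
z = (np.arange(200) * 37) % 200            # unrelated covariate (fixed shuffle)

signal_stat = rmst_screen_stat(x, time, event, tau=5.0)
noise_stat = rmst_screen_stat(z, time, event, tau=5.0)
print(signal_stat, noise_stat)
```

In a screening setting, a statistic of this kind would be computed for every candidate feature and the features ranked by it; a feature with a U-shaped effect, as in the toy data, receives a large value even though its effect is not monotone.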
Citations: 0