Journal of Statistical Software: Latest Articles

microsynth: Synthetic Control Methods for Disaggregated and Micro-Level Data in R
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-14 | DOI: 10.18637/JSS.V097.I02
Michael W Robbins, Steven Davenport
The R package microsynth has been developed for implementation of the synthetic control methodology for comparative case studies involving micro- or meso-level data. The methodology implemented within microsynth is designed to assess the efficacy of a treatment or intervention within a well-defined geographic region that is itself a composite of several smaller regions (where data are available at the more granular level for comparison regions as well). The effect of the intervention on one or more time-varying outcomes is evaluated by determining a synthetic control region that resembles the treatment region across pre-intervention values of the outcome(s) and time-invariant covariates and that is a weighted composite of many untreated comparison regions. The microsynth procedure includes functionality that enables its user to (1) calculate weights for synthetic control, (2) tabulate results for statistical inferences, and (3) create time series plots of outcomes for treatment and synthetic control. In this article, microsynth is described in detail and its application is illustrated using data from a drug market intervention in Seattle, WA.
Citations: 21
Simulating Survival Data Using the simsurv R Package
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-14 | DOI: 10.18637/JSS.V097.I03
S. Brilleman, R. Wolfe, M. Moreno-Betancur, M. Crowther
The simsurv R package allows users to simulate survival (i.e., time-to-event) data from standard parametric distributions (exponential, Weibull, and Gompertz), two-component mixture distributions, or a user-defined hazard function. Baseline covariates can be included under a proportional hazards assumption. Clustered event times, for example individuals within a family, are also easily accommodated. Time-dependent effects (i.e., nonproportional hazards) can be included by interacting covariates with linear time or a user-defined function of time. Under a user-defined hazard function, event times can be generated for a variety of complex models such as flexible (spline-based) baseline hazards, models with time-varying covariates, or joint longitudinal-survival models.
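As a rough illustration of how event times can be simulated under a proportional hazards assumption, the Python sketch below draws Weibull event times by inverting the survival function. It does not use or reproduce simsurv itself: the function name `simulate_weibull_ph`, the parametrization h(t | x) = lam * gamma * t^(gamma - 1) * exp(x . beta), the administrative censoring time, and all parameter values are assumptions made for this example.

```python
import math
import random


def simulate_weibull_ph(covariates, betas, lam, gamma, max_time, rng):
    """Draw one event time per subject from a Weibull proportional hazards model.

    Assumed hazard: h(t | x) = lam * gamma * t**(gamma - 1) * exp(x . beta),
    so S(t | x) = exp(-lam * exp(x . beta) * t**gamma). Setting S(T) = U with
    U ~ Uniform(0, 1) and solving for T gives the inversion formula used below.
    Times beyond max_time are administratively censored.
    """
    rows = []
    for x in covariates:
        linpred = sum(b * xi for b, xi in zip(betas, x))
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
        t = (-math.log(u) / (lam * math.exp(linpred))) ** (1.0 / gamma)
        event = t <= max_time
        rows.append({"eventtime": min(t, max_time), "status": int(event)})
    return rows


rng = random.Random(42)
# One binary baseline covariate (for example a treatment arm) with a
# hypothetical log hazard ratio of -0.5.
covs = [[rng.randint(0, 1)] for _ in range(2000)]
sim = simulate_weibull_ph(covs, betas=[-0.5], lam=0.1, gamma=1.5,
                          max_time=5.0, rng=rng)
```

Each returned row holds the observed time and an event indicator (1 = event, 0 = censored at `max_time`). A user-defined hazard without a closed-form inverse would instead need numerical integration and root finding, which this sketch does not attempt.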
Citations: 29
A Bayesian Approach for Model-Based Clustering of Several Binary Dissimilarity Matrices: The dmbc Package in R
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v100.i16
S. Venturini, R. Piccarreta
We introduce the new package dmbc that implements a Bayesian algorithm for clustering a set of binary dissimilarity matrices within a model-based framework. Specifically, we consider the case when S matrices are available, each describing the dissimilarities among the same n objects, possibly expressed by S subjects (judges), or measured under different experimental conditions, or with reference to different characteristics of the objects themselves. In particular, we focus on binary dissimilarities, taking values 0 or 1 depending on whether or not two objects are deemed dissimilar. We are interested in analyzing such data using multidimensional scaling (MDS). Unlike standard MDS algorithms, our goal is to cluster the dissimilarity matrices and, simultaneously, to extract an MDS configuration specific to each cluster. To this end, we develop a fully Bayesian three-way MDS approach, where the elements of each dissimilarity matrix are modeled as a mixture of Bernoulli random vectors. The parameter estimates and the MDS configurations are derived using a hybrid Metropolis-Gibbs Markov Chain Monte Carlo algorithm. We also propose a BIC-like criterion for jointly selecting the optimal number of clusters and latent space dimensions. We illustrate our approach using both synthetic data and a publicly available data set taken from the literature. For the sake of efficiency, the core computations in the package are implemented in C/C++. The package also allows the simulation of multiple chains through the support of the parallel package.
Citations: 1
Data Science in Stata 16: Frames, Lasso, and Python Integration
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v098.s01
A. T. Ho, Kim P. Huynh, David T. Jacho-Chávez, Diego Rojas-Baez
Stata (StataCorp 2019) is one of the software packages most widely used for data analysis, statistics, and model fitting by economists, public policy researchers, and epidemiologists, among others. Stata’s recent release of version 16 in June 2019 includes an up-to-date methodological library and user-friendly versions of various cutting-edge techniques. In this release, Stata has implemented several changes and additions (see https://www.stata.com/new-in-stata/), including lasso, multiple data sets in memory, meta-analysis, choice models, Python integration, Bayesian multiple chains, panel-data extended regression models, sample-size analysis for confidence intervals, panel-data mixed logit, nonlinear dynamic stochastic general equilibrium (DSGE) models, and numerical integration. This review covers the most salient innovations in Stata 16, the first release that brings along an implementation of machine-learning tools. The three innovations we consider in this review are: (1) multiple data sets in memory, (2) lasso for causal inference, and (3) Python integration. Each of the following three sections describes one of these innovations. The last section presents the final thoughts and conclusions of our review.
Citations: 6
subtee: An R Package for Subgroup Treatment Effect Estimation in Clinical Trials
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v099.i14
Nicolás M Ballarini, Marius Thomas, G. Rosenkranz, B. Bornkamp
The investigation of subgroups is an integral part of randomized clinical trials. Exploration of treatment effect heterogeneity is typically performed by covariate-adjusted analyses including treatment-by-covariate interactions. Several statistical techniques, such as model averaging and bagging, have recently been proposed to address the problem of selection bias in treatment effect estimates for subgroups. In this paper, we describe the subtee R package for subgroup treatment effect estimation. The package can be used for all commonly encountered types of outcomes in clinical trials (continuous, binary, survival, count). We also provide additional functions to build the subgroup variables to be used and to plot the results using forest plots. The functions are demonstrated using data from a clinical trial investigating a treatment for prostate cancer with a survival endpoint.
Citations: 5
IncDTW: An R Package for Incremental Calculation of Dynamic Time Warping
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v099.i09
Maximilian Leodolter, C. Plant, Norbert Brändle
Citations: 5
BNPmix: An R Package for Bayesian Nonparametric Modeling via Pitman-Yor Mixtures
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v100.i15
R. Corradin, A. Canale, Bernardo Nipoti
This introduction to the R package BNPmix is currently in press in the Journal of Statistical Software. BNPmix is an R package for Bayesian nonparametric multivariate density estimation, clustering, and regression, using Pitman-Yor mixture models, a flexible and robust generalization of the popular class of Dirichlet process mixture models. A variety of model specifications and state-of-the-art posterior samplers are implemented. In order to achieve computational efficiency, all sampling methods are written in C++ and seamlessly integrated into R by means of the Rcpp and RcppArmadillo packages. BNPmix exploits the ggplot2 capabilities and implements a series of generic functions to plot and print summaries of posterior densities and induced clustering of the data.
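To make concrete the sense in which Pitman-Yor models generalize Dirichlet process models, the Python sketch below samples a random partition from the two-parameter urn scheme (a generalized Chinese restaurant process). It is only an illustration of the prior on clusterings and is not one of the posterior samplers implemented in BNPmix; the function name `pitman_yor_partition` and the parameter names `strength` and `discount` are chosen for this example.

```python
import random


def pitman_yor_partition(n, strength, discount, rng):
    """Sample cluster labels for n items from the Pitman-Yor urn scheme.

    With i items already seated in k clusters of sizes n_1, ..., n_k, the next
    item joins cluster j with probability (n_j - discount) / (i + strength)
    and opens a new cluster with probability
    (strength + discount * k) / (i + strength).
    Setting discount = 0 recovers the Dirichlet process urn scheme.
    """
    if not (0.0 <= discount < 1.0) or strength <= -discount:
        raise ValueError("need 0 <= discount < 1 and strength > -discount")
    labels, sizes = [], []
    for _ in range(n):
        k = len(sizes)
        if k == 0:
            choice = 0  # the first item always opens the first cluster
        else:
            # Unnormalized weights; they sum to (items so far) + strength.
            weights = [size - discount for size in sizes]
            weights.append(strength + discount * k)
            choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            sizes.append(1)
        else:
            sizes[choice] += 1
        labels.append(choice)
    return labels, sizes


labels, sizes = pitman_yor_partition(500, strength=1.0, discount=0.5,
                                     rng=random.Random(1))
```

A larger `discount` makes new clusters more likely as the number of existing clusters grows, which is the extra flexibility the Pitman-Yor process adds over the Dirichlet process.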
Citations: 12
dynamichazard: Dynamic Hazard Models Using State Space Models
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v099.i07
Benjamin Christoffersen
Citations: 4
Flexible Scan Statistics for Detecting Spatial Disease Clusters: The rflexscan R Package
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v099.i13
Takahiro Otani, Kunihiko Takahashi
The spatial scan statistic is commonly used to detect spatial disease clusters in epidemiological studies. Among the various types of scan statistics, the flexible scan statistic proposed by Tango and Takahashi (2005) is one of the most promising methods to detect arbitrarily-shaped clusters. In this paper, we introduce a new R package, rflexscan (Otani and Takahashi 2021), that provides efficient and easy-to-use methods for analyses of spatial count data using the flexible spatial scan statistic. The package is designed for any of the following interrelated purposes: to evaluate whether reported spatial disease clusters are statistically significant, to test whether a disease is randomly distributed over space, and to perform geographical surveillance of disease to detect areas of significantly high rates. The functionality of the package is demonstrated through an application to a public-domain small-area cancer incidence dataset in New York State, USA, between 2005 and 2009.
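For readers unfamiliar with scan statistics, the Python sketch below scores candidate zones with the Poisson log-likelihood ratio commonly used in spatial scan statistics and reports the highest-scoring zone. It is a simplified illustration, not the rflexscan algorithm: the candidate zones are supplied by hand rather than enumerated as connected sets of neighbouring areas, no Monte Carlo significance test is run, and the function names and the toy counts are assumptions made for this example.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for a zone with elevated risk.

    Expected counts are assumed to be positive and scaled so that they sum to
    total_cases over the whole study region. Returns 0 when the zone has no
    more cases than expected.
    """
    c, e, n = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if n > c:  # the outside term vanishes when every case lies in the zone
        llr += (n - c) * math.log((n - c) / (n - e))
    return llr


def best_zone(cases, expected, candidate_zones):
    """Return (llr, zone) for the candidate zone with the largest statistic."""
    total = sum(cases)
    scored = []
    for zone in candidate_zones:
        c = sum(cases[i] for i in zone)
        e = sum(expected[i] for i in zone)
        scored.append((poisson_llr(c, e, total), zone))
    return max(scored, key=lambda pair: pair[0])


# Toy data: five areas with equal expected counts; areas 2 and 3 have excess cases.
cases = [2, 3, 12, 10, 3]
expected = [6.0, 6.0, 6.0, 6.0, 6.0]
zones = [(0,), (2,), (2, 3), (0, 1), (2, 3, 4)]
top_llr, top_zone = best_zone(cases, expected, zones)
```

In a full analysis the maximum statistic would be compared with its distribution under randomly redistributed cases to judge whether the detected cluster is statistically significant.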
Citations: 7
NScluster: An R Package for Maximum Palm Likelihood Estimation for Cluster Point Process Models Using OpenMP
IF 5.8 | Zone 2, Computer Science | Q1 Mathematics | Pub Date: 2021-01-01 | DOI: 10.18637/jss.v098.i06
U. Tanaka, Masami Saga, Junji Nakano
NScluster is an R package used for simulation and parameter estimation for Neyman-Scott cluster point process models and their extensions. For parameter estimation, NScluster uses the maximum Palm likelihood estimation procedure. As some estimation procedures proposed herein require heavy calculation, NScluster can use parallel computation via OpenMP and achieve significant speedup in some cases. In this paper, we discuss results obtained using a laptop PC and a shared memory supercomputer. In addition, we examine the performance characteristics of parallel computation via OpenMP.
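As a minimal sketch of what simulating a Neyman-Scott cluster process involves, the Python code below generates a Thomas-type pattern on the unit square: parent points follow a homogeneous Poisson process, each parent produces a Poisson number of offspring, and offspring are displaced from their parent by isotropic Gaussian noise. This is not NScluster code; the function names, the periodic (wrap-around) boundary, and the parameter values are assumptions made for this example.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method.

    Adequate for the small means used here; exp(-mean) underflows for very
    large means, where another sampler should be used.
    """
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_thomas(parent_intensity, mean_offspring, sigma, rng):
    """Simulate a Thomas-type Neyman-Scott process on the unit square.

    Only offspring points are returned; parents are discarded. Coordinates are
    wrapped around the square (periodic boundary), which is an assumption of
    this sketch made to avoid edge effects.
    """
    points = []
    for _ in range(poisson_draw(parent_intensity, rng)):
        px, py = rng.random(), rng.random()
        for _ in range(poisson_draw(mean_offspring, rng)):
            x = (px + rng.gauss(0.0, sigma)) % 1.0
            y = (py + rng.gauss(0.0, sigma)) % 1.0
            points.append((x, y))
    return points


pattern = simulate_thomas(parent_intensity=20, mean_offspring=10, sigma=0.02,
                          rng=random.Random(7))
```

The expected number of points is the product of `parent_intensity` and `mean_offspring`, and a smaller `sigma` yields tighter clusters.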
Citations: 0