
R Journal: Latest Articles

What's for dynr: A Package for Linear and Nonlinear Dynamic Modeling in R.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2019-06-01; DOI: 10.32614/rj-2019-012
Lu Ou, Michael D Hunter, Sy-Miin Chow

Intensive longitudinal data in the behavioral sciences are often noisy, multivariate in nature, and may involve multiple units undergoing regime switches by showing discontinuities interspersed with continuous dynamics. Despite increasing interest in using linear and nonlinear differential/difference equation models with regime switches, there has been a scarcity of software packages that are fast and freely accessible. We have created an R package called dynr that can handle a broad class of linear and nonlinear discrete- and continuous-time models, with regime-switching properties and linear Gaussian measurement functions, in C, while maintaining simple and easy-to-learn model specification functions in R. We present the mathematical and computational bases used by the dynr R package, and present two illustrative examples to demonstrate the unique features of dynr.
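
To make the model class in the abstract above concrete, here is a minimal Python sketch that simulates a two-regime, discrete-time linear model observed through a linear Gaussian measurement. It is not dynr code and does not reflect dynr's API; the function name, the autoregressive coefficients, the regime persistence probabilities, and the noise levels are all illustrative assumptions.

```python
import random

def simulate_regime_switching(n, phi=(0.9, -0.5), stay_prob=(0.95, 0.90),
                              process_sd=0.1, meas_sd=0.2, seed=0):
    """Simulate y_t = x_t + e_t with x_t = phi[r_t] * x_{t-1} + w_t.

    r_t is a two-state Markov chain; every default value is an assumed,
    illustrative number rather than anything taken from the paper.
    """
    rng = random.Random(seed)
    regime, state = 0, 0.0
    regimes, observations = [], []
    for _ in range(n):
        # Regime switch: leave the current regime with probability 1 - stay_prob.
        if rng.random() > stay_prob[regime]:
            regime = 1 - regime
        # Linear dynamics whose coefficient depends on the active regime.
        state = phi[regime] * state + rng.gauss(0.0, process_sd)
        # Linear Gaussian measurement of the latent state.
        observations.append(state + rng.gauss(0.0, meas_sd))
        regimes.append(regime)
    return regimes, observations

regimes, observations = simulate_regime_switching(200)
```

Data simulated this way show the pattern the abstract describes: stretches of smooth dynamics interrupted by abrupt changes whenever the latent regime flips.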

R Journal, volume 11, issue 1, pages 91-111. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8297742/pdf/nihms-1719194.pdf
Citations: 38
rFSA: An R Package for Finding Best Subsets and Interactions.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-12-01; Epub Date: 2018-12-08; DOI: 10.32614/rj-2018-059
Joshua Lambert, Liyu Gong, Corrine F Elliott, Katherine Thompson, Arnold Stromberg

Herein we present the R package rFSA, which implements an algorithm for improved variable selection. The algorithm searches a data space for models of a user-specified form that are statistically optimal under a measure of model quality. Many iterations afford a set of feasible solutions (or candidate models) that the researcher can evaluate for relevance to his or her questions of interest. The algorithm can be used to formulate new or to improve upon existing models in bioinformatics, health care, and myriad other fields in which the volume of available data has outstripped researchers' practical and computational ability to explore larger subsets or higher-order interaction terms. The package accommodates linear and generalized linear models, as well as a variety of criterion functions such as Allen's PRESS and AIC. New modeling strategies and criterion functions can be adapted easily to work with rFSA.
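
The search described in the abstract above can be illustrated with a generic exchange search: start from some subset, try every single-variable swap, accept the best improving swap, and stop when no swap improves the criterion. The Python sketch below is an assumption about the general shape of such an algorithm, not rFSA's implementation; `exchange_search`, the toy `scores`, and the toy criterion are hypothetical.

```python
def exchange_search(candidates, start, criterion):
    """Greedy single-swap search that minimizes `criterion` over fixed-size subsets.

    The result is a local optimum (a "feasible solution"): no single swap of
    one included variable for one excluded variable lowers the criterion.
    """
    current = tuple(start)
    best = criterion(current)
    improved = True
    while improved:
        improved = False
        best_trial = current
        for pos in range(len(current)):
            for var in candidates:
                if var in current:
                    continue
                # Swap the variable at `pos` for an excluded candidate.
                trial = current[:pos] + (var,) + current[pos + 1:]
                score = criterion(trial)
                if score < best:
                    best_trial, best = trial, score
                    improved = True
        current = best_trial
    return current, best

# Toy criterion (assumed): lower is better, standing in for AIC or PRESS.
scores = {"x1": 1.0, "x2": 5.0, "x3": 3.0, "x4": 4.0}
subset, value = exchange_search(
    list(scores), ("x1", "x3"), lambda s: -sum(scores[v] for v in s)
)
```

Running the search from many random starting subsets yields the set of candidate models the abstract mentions; with a real model-quality criterion, different starts can end at different local optima.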

R Journal, volume 10, issue 2, pages 295-308. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9205535/pdf/nihms-1811126.pdf
Citations: 24
Semiparametric Generalized Linear Models with the gldrm Package.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-07-01
Michael J Wurm, Paul J Rathouz

This paper introduces a new algorithm to estimate and perform inferences on a recently proposed and developed semiparametric generalized linear model (glm). Rather than selecting a particular parametric exponential family model, such as the Poisson distribution, this semiparametric glm assumes that the response is drawn from the more general exponential tilt family. The regression coefficients and unspecified reference distribution are estimated by maximizing a semiparametric likelihood. The new algorithm incorporates several computational stability and efficiency improvements over the algorithm originally proposed. In particular, the new algorithm performs well for either small or large support for the nonparametric response distribution. The algorithm is implemented in a new R package called gldrm.
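
As a rough illustration of the exponential tilt idea in the abstract above, the Python sketch below tilts a discrete reference distribution f0 by exp(theta * y), renormalizes, and solves for the theta that produces a requested mean. It is a toy under stated assumptions, not the gldrm algorithm: the function names, the four-point support, and the bisection bracket are hypothetical, and the reference distribution is fixed here whereas the paper estimates it.

```python
import math

def tilt(support, f0, theta):
    """Return the tilted probabilities f0(y) * exp(theta * y) / normalizer."""
    weights = [p * math.exp(theta * y) for y, p in zip(support, f0)]
    total = sum(weights)
    return [w / total for w in weights]

def tilted_mean(support, f0, theta):
    return sum(y * p for y, p in zip(support, tilt(support, f0, theta)))

def solve_theta(support, f0, mu, lo=-20.0, hi=20.0, tol=1e-12):
    """Bisection for theta; the tilted mean is increasing in theta.

    Assumes mu lies strictly inside the range of the support and that the
    bracket [lo, hi] is wide enough (an assumption made for this toy example).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(support, f0, mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

support = [0, 1, 2, 3]
f0 = [0.4, 0.3, 0.2, 0.1]          # reference distribution with mean 1.0
theta = solve_theta(support, f0, mu=2.0)
```

In a regression setting each observation would get its own theta, chosen so that the tilted mean matches the mean implied by the link function and the linear predictor.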

R Journal, volume 10, issue 1, pages 288-307. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6414059/pdf/nihms-1011992.pdf
Citations: 0
MGLM: An R Package for Multivariate Categorical Data Analysis.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-07-01; DOI: 10.32614/rj-2018-015
Juhyun Kim, Yiwen Zhang, Joshua Day, Hua Zhou

Data with multiple responses is ubiquitous in modern applications. However, few tools are available for regression analysis of multivariate counts. The most popular multinomial-logit model has a very restrictive mean-variance structure, limiting its applicability to many data sets. This article introduces an R package MGLM, short for multivariate response generalized linear models, that expands the current tools for regression analysis of polytomous data. Distribution fitting, random number generation, regression, and sparse regression are treated in a unifying framework. The algorithm, usage, and implementation details are discussed.

R Journal, volume 10, issue 1, pages 73-90. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7286576/pdf/nihms-1562404.pdf
Citations: 14
Semiparametric Generalized Linear Models with the gldrm Package
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-07-01; DOI: 10.32614/RJ-2018-027
Mike Wurm, P. Rathouz
This paper introduces a new algorithm to estimate and perform inferences on a recently proposed and developed semiparametric generalized linear model (glm). Rather than selecting a particular parametric exponential family model, such as the Poisson distribution, this semiparametric glm assumes that the response is drawn from the more general exponential tilt family. The regression coefficients and unspecified reference distribution are estimated by maximizing a semiparametric likelihood. The new algorithm incorporates several computational stability and efficiency improvements over the algorithm originally proposed. In particular, the new algorithm performs well for either small or large support for the nonparametric response distribution. The algorithm is implemented in a new R package called gldrm.
R Journal, volume 10, pages 288-307.
Citations: 3
A System for an Accountable Data Analysis Process in R.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-07-01; Epub Date: 2018-05-15
Jonathan Gelfond, Martin Goros, Brian Hernandez, Alex Bokov

Efficiently producing transparent analyses may be difficult for beginners or tedious for the experienced. This implies a need for computing systems and environments that can efficiently satisfy reproducibility and accountability standards. To this end, we have developed a system, R package, and R Shiny application called adapr (Accountable Data Analysis Process in R) that is built on the principle of accountable units. An accountable unit is a data file (statistic, table or graphic) that can be associated with a provenance, meaning how it was created, when it was created and who created it, and this is similar to the 'verifiable computational results' (VCR) concept proposed by Gavish and Donoho. Both accountable units and VCRs are version controlled, sharable, and can be incorporated into a collaborative project. However, accountable units use file hashes and do not involve watermarking or public repositories like VCRs. Reproducing collaborative work may be highly complex, requiring repeating computations on multiple systems from multiple authors; however, determining the provenance of each unit is simpler, requiring only a search using file hashes and version control systems.
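
The provenance lookup by file hash described in the abstract above can be sketched in a few lines of Python. This is an illustration only, not adapr's implementation: the choice of SHA-256, the in-memory `ledger` dictionary, and the names `register` and `provenance` are assumptions made for the example.

```python
import hashlib

def file_hash(content: bytes) -> str:
    """Hash the bytes of a result file (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()

def register(ledger, content, script, author, created):
    """Record how, by whom, and when a result file was produced."""
    digest = file_hash(content)
    ledger[digest] = {"script": script, "author": author, "created": created}
    return digest

def provenance(ledger, content):
    """Look up a file's provenance by its hash; None if the file is unknown."""
    return ledger.get(file_hash(content))

ledger = {}
table = b"group,mean\nA,1.2\nB,3.4\n"
register(ledger, table, script="summary_table.R", author="analyst", created="2018-05-15")
record = provenance(ledger, table)            # found: same bytes, same hash
unknown = provenance(ledger, table + b"x")    # None: the content was altered
```

Because the key is derived from the file's content, any change to a table or figure breaks the link to its recorded provenance, which is what makes each unit checkable on its own.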

R Journal, volume 10, issue 1, pages 6-21. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6261481/pdf/nihms962940.pdf
Citations: 0
A System for an Accountable Data Analysis Process in R
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-05-15; DOI: 10.32614/RJ-2018-001
J. Gelfond, M. Goros, B. Hernandez, A. Bokov
Efficiently producing transparent analyses may be difficult for beginners or tedious for the experienced. This implies a need for computing systems and environments that can efficiently satisfy reproducibility and accountability standards. To this end, we have developed a system, R package, and R Shiny application called adapr (Accountable Data Analysis Process in R) that is built on the principle of accountable units. An accountable unit is a data file (statistic, table or graphic) that can be associated with a provenance, meaning how it was created, when it was created and who created it, and this is similar to the 'verifiable computational results' (VCR) concept proposed by Gavish and Donoho. Both accountable units and VCRs are version controlled, sharable, and can be incorporated into a collaborative project. However, accountable units use file hashes and do not involve watermarking or public repositories like VCRs. Reproducing collaborative work may be highly complex, requiring repeating computations on multiple systems from multiple authors; however, determining the provenance of each unit is simpler, requiring only a search using file hashes and version control systems.
R Journal, volume 10, pages 6-21.
Citations: 14
R Package imputeTestbench to Compare Imputation Methods for Univariate Time Series.
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-01-01
Marcus W Beck, Neeraj Bokde, Gualberto Asencio-Cortés, Kishore Kulat

Missing observations are common in time series data and several methods are available to impute these values prior to analysis. Variation in statistical characteristics of univariate time series can have a profound effect on characteristics of missing observations and, therefore, the accuracy of different imputation methods. The imputeTestbench package can be used to compare the prediction accuracy of different methods as related to the amount and type of missing data for a user-supplied dataset. Missing data are simulated by removing observations completely at random or in blocks of different sizes depending on characteristics of the data. Several imputation algorithms are included with the package that vary from simple replacement with means to more complex interpolation methods. The testbench is not limited to the default functions and users can add or remove methods as needed. Plotting functions also allow comparative visualization of the behavior and effectiveness of different algorithms. We present example applications that demonstrate how the package can be used to understand differences in prediction accuracy between methods as affected by characteristics of a dataset and the nature of missing data.
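
The testbench workflow in the abstract above (remove values from a complete series, impute them, score each method against the known truth) can be sketched as follows. This Python example is a simplified stand-in, not the imputeTestbench API: the function names, the two imputation methods, and the RMSE error measure are assumptions, and the first and last points are always kept so that interpolation is defined.

```python
import math
import random

def remove_at_random(series, fraction, seed=0):
    """Drop a fraction of interior points completely at random."""
    rng = random.Random(seed)
    interior = list(range(1, len(series) - 1))
    missing = set(rng.sample(interior, int(round(fraction * len(interior)))))
    return [None if i in missing else x for i, x in enumerate(series)]

def remove_block(series, start, length):
    """Drop one contiguous block (caller keeps it away from the endpoints)."""
    return [None if start <= i < start + length else x for i, x in enumerate(series)]

def impute_mean(series):
    observed = [x for x in series if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in series]

def impute_linear(series):
    out = list(series)
    known = [i for i, x in enumerate(series) if x is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out

def rmse(truth, filled, gappy):
    """Error over the removed positions only."""
    errs = [(t - f) ** 2 for t, f, g in zip(truth, filled, gappy) if g is None]
    return math.sqrt(sum(errs) / len(errs))

truth = [float(i) for i in range(20)]   # a trending toy series
gappy = remove_block(truth, start=5, length=4)
results = {
    "mean": rmse(truth, impute_mean(gappy), gappy),
    "linear": rmse(truth, impute_linear(gappy), gappy),
}
```

On a trending series like this one, mean replacement fares much worse than interpolation, which illustrates the abstract's point that the best method depends on the characteristics of the series and on how the data go missing.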

R Journal, volume 10, issue 1, pages 218-233. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6309171/pdf/nihms-1507947.pdf
Citations: 0
PanJen: An R package for Ranking Transformations in a Linear Regression
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2018-01-01; DOI: 10.32614/RJ-2018-018
C. U. Jensen, T. Panduro
R Journal, volume 9, issue 1, pages 109-121.
Citations: 1
rpsftm: An R Package for Rank Preserving Structural Failure Time Models
IF 2.1; Zone 4 (Computer Science); Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2017-12-04; DOI: 10.32614/RJ-2017-068
Annabel Allison, I. White, S. Bond
Treatment switching in a randomised controlled trial occurs when participants change from their randomised treatment to the other trial treatment during the study. Failure to account for treatment switching in the analysis (i.e. by performing a standard intention-to-treat analysis) can lead to biased estimates of treatment efficacy. The rank preserving structural failure time model (RPSFTM) is a method used to adjust for treatment switching in trials with survival outcomes. The RPSFTM is due to Robins and Tsiatis (1991) and has been developed by White et al. (1997, 1999). The method is randomisation based and uses only the randomised treatment group, observed event times, and treatment history in order to estimate a causal treatment effect. The treatment effect, ψ, is estimated by balancing counter-factual event times (that would be observed if no treatment were received) between treatment groups. G-estimation is used to find the value of ψ such that a test statistic Z(ψ) = 0. This is usually the test statistic used in the intention-to-treat analysis, for example, the log rank test statistic. We present an R package that implements the method of rpsftm.
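
The g-estimation step in the abstract above can be sketched numerically. The Python example below computes counterfactual times as U(ψ) = T_off + exp(ψ) · T_on and searches for the ψ at which a test statistic comparing the two randomised arms equals zero. It is a deliberately simplified illustration, not the rpsftm package: there is no censoring, a difference in mean log counterfactual times is swapped in for the log rank statistic, the root is found by plain bisection, and the data are constructed by hand with a true ψ of -0.5.

```python
import math

def counterfactual_time(t_off, t_on, psi):
    """Event time that would have been seen with no treatment, under psi."""
    return t_off + math.exp(psi) * t_on

def z_statistic(treated, control, psi):
    """Assumed stand-in for the log rank statistic: difference in mean log U."""
    def mean_log_u(arm):
        return sum(math.log(counterfactual_time(off, on, psi)) for off, on in arm) / len(arm)
    return mean_log_u(treated) - mean_log_u(control)

def g_estimate(treated, control, lo=-3.0, hi=3.0, tol=1e-10):
    """Bisection for the psi with Z(psi) = 0 (assumes a sign change on [lo, hi])."""
    z_lo = z_statistic(treated, control, lo)
    if z_lo * z_statistic(treated, control, hi) > 0:
        raise ValueError("Z(psi) does not change sign on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        z_mid = z_statistic(treated, control, mid)
        if z_lo * z_mid <= 0:
            hi = mid
        else:
            lo, z_lo = mid, z_mid
    return 0.5 * (lo + hi)

# Hand-built toy data: each record is (time off treatment, time on treatment).
true_psi = -0.5
untreated_times = [1.0, 2.0, 3.0, 4.0]
treated = [(0.0, u * math.exp(-true_psi)) for u in untreated_times]
control = [
    (1.0, 0.0),                                # never switches
    (2.0, 0.0),                                # never switches
    (1.0, (3.0 - 1.0) * math.exp(-true_psi)),  # switches to treatment at time 1
    (2.0, (4.0 - 2.0) * math.exp(-true_psi)),  # switches to treatment at time 2
]
psi_hat = g_estimate(treated, control)
```

At the true ψ the counterfactual times in the two arms coincide, so the statistic is zero there; the search recovers that value using only randomised group, observed times, and treatment history, as the abstract describes.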
R Journal, volume 9, pages 342-353.
Citations: 8