Treatment switching in a randomised controlled trial occurs when participants change from their randomised treatment to the other trial treatment during the study. Failure to account for treatment switching in the analysis (i.e. by performing a standard intention-to-treat analysis) can lead to biased estimates of treatment efficacy. The rank preserving structural failure time model (RPSFTM) is a method used to adjust for treatment switching in trials with survival outcomes. The RPSFTM is due to Robins and Tsiatis (1991) and has been developed by White et al. (1997, 1999). The method is randomisation based and uses only the randomised treatment group, observed event times, and treatment history in order to estimate a causal treatment effect. The treatment effect, ψ, is estimated by balancing counter-factual event times (that would be observed if no treatment were received) between treatment groups. G-estimation is used to find the value of ψ such that a test statistic Z(ψ) = 0. This is usually the test statistic used in the intention-to-treat analysis, for example, the log rank test statistic. We present an R package that implements the method of rpsftm.
{"title":"rpsftm: An R Package for Rank Preserving Structural Failure Time Models.","authors":"Annabel Allison, Ian R White, Simon Bond","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Treatment switching in a randomised controlled trial occurs when participants change from their randomised treatment to the other trial treatment during the study. Failure to account for treatment switching in the analysis (i.e. by performing a standard intention-to-treat analysis) can lead to biased estimates of treatment efficacy. The rank preserving structural failure time model (RPSFTM) is a method used to adjust for treatment switching in trials with survival outcomes. The RPSFTM is due to Robins and Tsiatis (1991) and has been developed by White et al. (1997, 1999). The method is randomisation based and uses only the randomised treatment group, observed event times, and treatment history in order to estimate a causal treatment effect. The treatment effect, <i>ψ</i>, is estimated by balancing counter-factual event times (that would be observed if no treatment were received) between treatment groups. G-estimation is used to find the value of <i>ψ</i> such that a test statistic <i>Z</i>(<i>ψ</i>) = 0. This is usually the test statistic used in the intention-to-treat analysis, for example, the log rank test statistic. We present an R package that implements the method of rpsftm.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 2","pages":"342-353"},"PeriodicalIF":2.1,"publicationDate":"2017-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5858764/pdf/emss-76528.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35937148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The MDplot package provides plotting functions to allow for automated visualisation of molecular dynamics simulation output. It is especially useful in cases where the plot generation is rather tedious due to complex file formats or when a large number of plots are generated. The graphs that are supported range from those which are standard, such as RMsD/RMsF (root-mean-square deviation and root-mean-square fluctuation, respectively) to less standard, such as thermodynamic integration analysis and hydrogen bond monitoring over time. All told, they address many commonly used analyses. In this article, we set out the MDplot package's functions, give examples of the function calls, and show the associated plots. Plotting and data parsing is separated in all cases, i.e. the respective functions can be used independently. Thus, data manipulation and the integration of additional file formats is fairly easy. Currently, the loading functions support GROMOS, GROMACS, and AMBER file formats. Moreover, we also provide a Bash interface that allows simple embedding of MDplot into Bash scripts as the final analysis step. AVAILABILITY The package can be obtained in the latest major version from CRAN (https://cran.r-project.org/package=MDplot) or in the most recent version from the project's GitHub page at https://github.com/MDplot/MDplot, where feedback is also most welcome. MDplot is published under the GPL-3 license.
{"title":"MDplot: Visualise Molecular Dynamics","authors":"Christian Margreitter, C. Oostenbrink","doi":"10.32614/RJ-2017-007","DOIUrl":"https://doi.org/10.32614/RJ-2017-007","url":null,"abstract":"The MDplot package provides plotting functions to allow for automated visualisation of molecular dynamics simulation output. It is especially useful in cases where the plot generation is rather tedious due to complex file formats or when a large number of plots are generated. The graphs that are supported range from those which are standard, such as RMsD/RMsF (root-mean-square deviation and root-mean-square fluctuation, respectively) to less standard, such as thermodynamic integration analysis and hydrogen bond monitoring over time. All told, they address many commonly used analyses. In this article, we set out the MDplot package's functions, give examples of the function calls, and show the associated plots. Plotting and data parsing is separated in all cases, i.e. the respective functions can be used independently. Thus, data manipulation and the integration of additional file formats is fairly easy. Currently, the loading functions support GROMOS, GROMACS, and AMBER file formats. Moreover, we also provide a Bash interface that allows simple embedding of MDplot into Bash scripts as the final analysis step.\u0000\u0000\u0000AVAILABILITY\u0000The package can be obtained in the latest major version from CRAN (https://cran.r-project.org/package=MDplot) or in the most recent version from the project's GitHub page at https://github.com/MDplot/MDplot, where feedback is also most welcome. MDplot is published under the GPL-3 license.","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1 1","pages":"164-186"},"PeriodicalIF":2.1,"publicationDate":"2017-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44420577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The MDplot package provides plotting functions to allow for automated visualisation of molecular dynamics simulation output. It is especially useful in cases where the plot generation is rather tedious due to complex file formats or when a large number of plots are generated. The graphs that are supported range from those which are standard, such as RMsD/RMsF (root-mean-square deviation and root-mean-square fluctuation, respectively) to less standard, such as thermodynamic integration analysis and hydrogen bond monitoring over time. All told, they address many commonly used analyses. In this article, we set out the MDplot package's functions, give examples of the function calls, and show the associated plots. Plotting and data parsing is separated in all cases, i.e. the respective functions can be used independently. Thus, data manipulation and the integration of additional file formats is fairly easy. Currently, the loading functions support GROMOS, GROMACS, and AMBER file formats. Moreover, we also provide a Bash interface that allows simple embedding of MDplot into Bash scripts as the final analysis step.
Availability: The package can be obtained in the latest major version from CRAN (https://cran.r-project.org/package=MDplot) or in the most recent version from the project's GitHub page at https://github.com/MDplot/MDplot, where feedback is also most welcome. MDplot is published under the GPL-3 license.
{"title":"MDplot: Visualise Molecular Dynamics.","authors":"Christian Margreitter, Chris Oostenbrink","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The <b>MDplot</b> package provides plotting functions to allow for automated visualisation of molecular dynamics simulation output. It is especially useful in cases where the plot generation is rather tedious due to complex file formats or when a large number of plots are generated. The graphs that are supported range from those which are standard, such as RMsD/RMsF (root-mean-square deviation and root-mean-square fluctuation, respectively) to less standard, such as thermodynamic integration analysis and hydrogen bond monitoring over time. All told, they address many commonly used analyses. In this article, we set out the <b>MDplot</b> package's functions, give examples of the function calls, and show the associated plots. Plotting and data parsing is separated in all cases, i.e. the respective functions can be used independently. Thus, data manipulation and the integration of additional file formats is fairly easy. Currently, the loading functions support GROMOS, GROMACS, and AMBER file formats. Moreover, we also provide a Bash interface that allows simple embedding of MDplot into Bash scripts as the final analysis step.</p><p><strong>Availability: </strong>The package can be obtained in the latest major version from CRAN (https://cran.r-project.org/package=MDplot) or in the most recent version from the project's GitHub page at https://github.com/MDplot/MDplot, where feedback is also most welcome. MDplot is published under the GPL-3 license.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1","pages":"164-186"},"PeriodicalIF":2.1,"publicationDate":"2017-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5570379/pdf/emss-73867.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35450358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heat maps are commonly used to display the spatial distribution of a response observed on a two-dimensional grid. The autoimage package provides convenient functions for constructing multiple heat maps in unified, seamless way, particularly when working with projected coordinates. The autoimage package natively supports: 1. automatic inclusion of a color scale with the plotted image, 2. construction of heat maps for responses observed on regular or irregular grids, as well as non-gridded data, 3. construction of a matrix of heat maps with a common color scale, 4. construction of a matrix of heat maps with individual color scales, 5. projecting coordinates before plotting, 6. easily adding geographic borders, points, and other features to the heat maps. After comparing the autoimage package's capabilities for constructing heat maps to those of existing tools, a carefully selected set of examples is used to highlight the capabilities of the autoimage package.
{"title":"autoimage: Multiple Heat Maps for Projected Coordinates.","authors":"Joshua P French","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Heat maps are commonly used to display the spatial distribution of a response observed on a two-dimensional grid. The <b>autoimage</b> package provides convenient functions for constructing multiple heat maps in unified, seamless way, particularly when working with projected coordinates. The <b>autoimage</b> package natively supports: 1. automatic inclusion of a color scale with the plotted image, 2. construction of heat maps for responses observed on regular or irregular grids, as well as non-gridded data, 3. construction of a matrix of heat maps with a common color scale, 4. construction of a matrix of heat maps with individual color scales, 5. projecting coordinates before plotting, 6. easily adding geographic borders, points, and other features to the heat maps. After comparing the <b>autoimage</b> package's capabilities for constructing heat maps to those of existing tools, a carefully selected set of examples is used to highlight the capabilities of the <b>autoimage</b> package.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1","pages":"284-297"},"PeriodicalIF":2.1,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5685525/pdf/nihms883798.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35614220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-01-01Epub Date: 2017-06-08DOI: 10.32614/rj-2017-032
G Brooke Anderson, Colin Eason, Elizabeth A Barnes
Research on climate change impacts can require extensive processing of climate model output, especially when using ensemble techniques to incorporate output from multiple climate models and multiple simulations of each model. This processing can be particularly extensive when identifying and characterizing multi-day extreme events like heat waves and frost day spells, as these must be processed from model output with daily time steps. Further, climate model output is in a format and follows standards that may be unfamiliar to most R users. Here, we provide an overview of working with daily climate model output data in R. We then present the futureheatwaves package, which we developed to ease the process of identifying, characterizing, and exploring multi-day extreme events in climate model output. This package can input a directory of climate model output files, identify all extreme events using customizable event definitions, and summarize the output using user-specified functions.
{"title":"Working with Daily Climate Model Output Data in R and the futureheatwaves Package.","authors":"G Brooke Anderson, Colin Eason, Elizabeth A Barnes","doi":"10.32614/rj-2017-032","DOIUrl":"https://doi.org/10.32614/rj-2017-032","url":null,"abstract":"<p><p>Research on climate change impacts can require extensive processing of climate model output, especially when using ensemble techniques to incorporate output from multiple climate models and multiple simulations of each model. This processing can be particularly extensive when identifying and characterizing multi-day extreme events like heat waves and frost day spells, as these must be processed from model output with daily time steps. Further, climate model output is in a format and follows standards that may be unfamiliar to most R users. Here, we provide an overview of working with daily climate model output data in R. We then present the futureheatwaves package, which we developed to ease the process of identifying, characterizing, and exploring multi-day extreme events in climate model output. This package can input a directory of climate model output files, identify all extreme events using customizable event definitions, and summarize the output using user-specified functions.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1","pages":"124-137"},"PeriodicalIF":2.1,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8048414/pdf/nihms-1061093.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38885023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-01-01Epub Date: 2017-06-10DOI: 10.32614/rj-2017-026
G Brooke Anderson, Dirk Eddelbuettel
Data-only packages offer a way to provide extended functionality for other R users. However, such packages can be large enough to exceed the package size limit (5 megabytes) for the Comprehensive R Archive Network (CRAN). As an alternative, large data packages can be posted to additional repostiories beyond CRAN itself in a way that allows smaller code packages on CRAN to access and use the data. The drat package facilitates creation and use of such alternative repositories and makes it particularly simple to host them via GitHub. CRAN packages can draw on packages posted to drat repositories through the use of the 'Additonal_repositories' field in the DESCRIPTION file. This paper describes how R users can create a suite of coordinated packages, in which larger data packages are hosted in an alternative repository created with drat, while a smaller code package that interacts with this data is created that can be submitted to CRAN.
{"title":"Hosting Data Packages via drat: A Case Study with Hurricane Exposure Data.","authors":"G Brooke Anderson, Dirk Eddelbuettel","doi":"10.32614/rj-2017-026","DOIUrl":"https://doi.org/10.32614/rj-2017-026","url":null,"abstract":"<p><p>Data-only packages offer a way to provide extended functionality for other R users. However, such packages can be large enough to exceed the package size limit (5 megabytes) for the Comprehensive R Archive Network (CRAN). As an alternative, large data packages can be posted to additional repostiories beyond CRAN itself in a way that allows smaller code packages on CRAN to access and use the data. The drat package facilitates creation and use of such alternative repositories and makes it particularly simple to host them via GitHub. CRAN packages can draw on packages posted to drat repositories through the use of the 'Additonal_repositories' field in the DESCRIPTION file. This paper describes how R users can create a suite of coordinated packages, in which larger data packages are hosted in an alternative repository created with drat, while a smaller code package that interacts with this data is created that can be submitted to CRAN.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1","pages":"486-497"},"PeriodicalIF":2.1,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8048413/pdf/nihms-1061092.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38885025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heat maps are commonly used to display the spatial distribution of a response observed on a two-dimensional grid. The autoimage package provides convenient functions for constructing multiple heat maps in unified, seamless way, particularly when working with projected coordinates. The autoimage package natively supports: 1. automatic inclusion of a color scale with the plotted image, 2. construction of heat maps for responses observed on regular or irregular grids, as well as non-gridded data, 3. construction of a matrix of heat maps with a common color scale, 4. construction of a matrix of heat maps with individual color scales, 5. projecting coordinates before plotting, 6. easily adding geographic borders, points, and other features to the heat maps. After comparing the autoimage package's capabilities for constructing heat maps to those of existing tools, a carefully selected set of examples is used to highlight the capabilities of the autoimage package.
{"title":"autoimage: Multiple Heat Maps for Projected Coordinates","authors":"J. French","doi":"10.32614/RJ-2017-025","DOIUrl":"https://doi.org/10.32614/RJ-2017-025","url":null,"abstract":"Heat maps are commonly used to display the spatial distribution of a response observed on a two-dimensional grid. The autoimage package provides convenient functions for constructing multiple heat maps in unified, seamless way, particularly when working with projected coordinates. The autoimage package natively supports: 1. automatic inclusion of a color scale with the plotted image, 2. construction of heat maps for responses observed on regular or irregular grids, as well as non-gridded data, 3. construction of a matrix of heat maps with a common color scale, 4. construction of a matrix of heat maps with individual color scales, 5. projecting coordinates before plotting, 6. easily adding geographic borders, points, and other features to the heat maps. After comparing the autoimage package's capabilities for constructing heat maps to those of existing tools, a carefully selected set of examples is used to highlight the capabilities of the autoimage package.","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"9 1 1","pages":"284-297"},"PeriodicalIF":2.1,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69958652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe a new package called pseval that implements the core methods for the evaluation of principal surrogates in a single clinical trial. It provides a flexible interface for defining models for the risk given treatment and the surrogate, the models for integration over the missing counterfactual surrogate responses, and the estimation methods. Estimated maximum likelihood and pseudo-score can be used for estimation, and the bootstrap for inference. A variety of post-estimation methods are provided, including print, summary, plot, and testing. We summarize the main statistical methods that are implemented in the package and illustrate its use from the perspective of a novice R user.
{"title":"An Introduction to Principal Surrogate Evaluation with the pseval Package.","authors":"Michael C Sachs, Erin E Gabriel","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We describe a new package called <b>pseval</b> that implements the core methods for the evaluation of principal surrogates in a single clinical trial. It provides a flexible interface for defining models for the risk given treatment and the surrogate, the models for integration over the missing counterfactual surrogate responses, and the estimation methods. Estimated maximum likelihood and pseudo-score can be used for estimation, and the bootstrap for inference. A variety of post-estimation methods are provided, including print, summary, plot, and testing. We summarize the main statistical methods that are implemented in the package and illustrate its use from the perspective of a novice R user.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"8 2","pages":"277-292"},"PeriodicalIF":2.1,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5774631/pdf/nihms860175.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35753795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe a new package called pseval that implements the core methods for the evaluation of principal surrogates in a single clinical trial. It provides a flexible interface for defining models for the risk given treatment and the surrogate, the models for integration over the missing counterfactual surrogate responses, and the estimation methods. Estimated maximum likelihood and pseudo-score can be used for estimation, and the bootstrap for inference. A variety of post-estimation methods are provided, including print, summary, plot, and testing. We summarize the main statistical methods that are implemented in the package and illustrate its use from the perspective of a novice R user.
{"title":"An Introduction to Principal Surrogate Evaluation with the pseval Package","authors":"M. Sachs, E. Gabriel","doi":"10.32614/RJ-2016-046","DOIUrl":"https://doi.org/10.32614/RJ-2016-046","url":null,"abstract":"We describe a new package called pseval that implements the core methods for the evaluation of principal surrogates in a single clinical trial. It provides a flexible interface for defining models for the risk given treatment and the surrogate, the models for integration over the missing counterfactual surrogate responses, and the estimation methods. Estimated maximum likelihood and pseudo-score can be used for estimation, and the bootstrap for inference. A variety of post-estimation methods are provided, including print, summary, plot, and testing. We summarize the main statistical methods that are implemented in the package and illustrate its use from the perspective of a novice R user.","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"8 2 1","pages":"277-292"},"PeriodicalIF":2.1,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69958541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Missing observations are common in time series data and several methods are available to impute these values prior to analysis. Variation in statistical characteristics of univariate time series can have a profound effect on characteristics of missing observations and, therefore, the accuracy of different imputation methods. The imputeTestbench package can be used to compare the prediction accuracy of different methods as related to the amount and type of missing data for a user-supplied dataset. Missing data are simulated by removing observations completely at random or in blocks of different sizes depending on characteristics of the data. Several imputation algorithms are included with the package that vary from simple replacement with means to more complex interpolation methods. The testbench is not limited to the default functions and users can add or remove methods as needed. Plotting functions also allow comparative visualization of the behavior and effectiveness of different algorithms. We present example applications that demonstrate how the package can be used to understand differences in prediction accuracy between methods as affected by characteristics of a dataset and the nature of missing data.
{"title":"R Package imputeTestbench to Compare Imputation Methods for Univariate Time Series","authors":"M. Beck, N. Bokde, G. Asencio-Cortés, K. Kulat","doi":"10.32614/RJ-2018-024","DOIUrl":"https://doi.org/10.32614/RJ-2018-024","url":null,"abstract":"Missing observations are common in time series data and several methods are available to impute these values prior to analysis. Variation in statistical characteristics of univariate time series can have a profound effect on characteristics of missing observations and, therefore, the accuracy of different imputation methods. The imputeTestbench package can be used to compare the prediction accuracy of different methods as related to the amount and type of missing data for a user-supplied dataset. Missing data are simulated by removing observations completely at random or in blocks of different sizes depending on characteristics of the data. Several imputation algorithms are included with the package that vary from simple replacement with means to more complex interpolation methods. The testbench is not limited to the default functions and users can add or remove methods as needed. Plotting functions also allow comparative visualization of the behavior and effectiveness of different algorithms. We present example applications that demonstrate how the package can be used to understand differences in prediction accuracy between methods as affected by characteristics of a dataset and the nature of missing data.","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"10 1 1","pages":"218-233"},"PeriodicalIF":2.1,"publicationDate":"2016-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69958671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}