Temporal data often has a hierarchical structure, defined by categorical variables describing different levels, such as political regions or sales products. Nesting of categorical variables produces a hierarchical structure. The tsibbletalk package is developed to allow a user to interactively explore temporal data, relative to the nested or crossed structures. It can help to discover differences between category levels, and uncover interesting periodic or aperiodic slices. The package implements a shared tsibble object that allows for linked brushing between coordinated views, and a shiny module that aids in wrapping time lines for seasonal patterns. The tools are demonstrated using two data examples: domestic tourism in Australia and pedestrian traffic in Melbourne.
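The idea of wrapping a time line to expose seasonal patterns can be illustrated with a minimal sketch. This is not the tsibbletalk API (the package is written in R and uses shiny); the function name `wrap_series` and the daily counts below are hypothetical, chosen only to show how a regularly spaced series is cut into cycles that share one axis.

```python
from collections import defaultdict


def wrap_series(values, period):
    """Split a regularly spaced series into consecutive cycles of `period` points.

    Returns a dict mapping cycle number -> list of (position_in_cycle, value),
    so each cycle can be drawn as its own line over a shared 0..period-1 axis.
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")
    cycles = defaultdict(list)
    for t, v in enumerate(values):
        # t // period picks the cycle, t % period the position inside it
        cycles[t // period].append((t % period, v))
    return dict(cycles)


# Two weeks of made-up daily counts with a weekend bump (illustrative only).
daily = [10, 11, 10, 12, 11, 20, 22,
         9, 10, 11, 11, 12, 21, 23]
wrapped = wrap_series(daily, period=7)
for cycle, points in wrapped.items():
    print(cycle, points)
```

Overlaying the two cycles makes the weekend bump line up at positions 5 and 6, which is the kind of periodic slice the abstract refers to; trying a different `period` shows how a poorly chosen wrap hides the pattern.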
{"title":"Conversations in Time: Interactive Visualization to Explore Structured Temporal Data","authors":"Earo Wang, D. Cook","doi":"10.32614/rj-2021-050","DOIUrl":"https://doi.org/10.32614/rj-2021-050","url":null,"abstract":"Temporal data often has a hierarchical structure, defined by categorical variables describing different levels, such as political regions or sales products. Nesting of categorical variables produces a hierarchical structure. The tsibbletalk package is developed to allow a user to interactively explore temporal data, relative to the nested or crossed structures. It can help to discover differences between category levels, and uncover interesting periodic or aperiodic slices. The package implements a shared tsibble object that allows for linked brushing between coordinated views, and a shiny module that aids in wrapping time lines for seasonal patterns. The tools are demonstrated using two data examples: domestic tourism in Australia and pedestrian traffic in Melbourne.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"26 1","pages":"516"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76757537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Linear transformation models constitute a general family of parametric regression models for discrete and continuous responses. To accommodate correlated responses, the model is extended by incorporating mixed effects. This article presents the R package tramME, which builds on existing implementations of transformation models (mlt and tram packages) as well as Laplace approximation and automatic differentiation (using the TMB package), to calculate estimates and perform likelihood inference in mixed-effects transformation models. The resulting framework can be readily applied to a wide range of regression problems with grouped data structures.
{"title":"tramME: Mixed-Effects Transformation Models Using Template Model Builder","authors":"Bálint Tamási, T. Hothorn","doi":"10.32614/rj-2021-075","DOIUrl":"https://doi.org/10.32614/rj-2021-075","url":null,"abstract":"Linear transformation models constitute a general family of parametric regression models for discrete and continuous responses. To accommodate correlated responses, the model is extended by incorporating mixed effects. This article presents the R package tramME , which builds on existing implementations of transformation models ( mlt and tram packages) as well as Laplace approximation and automatic differentiation (using the TMB package), to calculate estimates and perform likelihood inference in mixed-effects transformation models. The resulting framework can be readily applied to a wide range of regression problems with grouped data structures.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"21 1","pages":"306"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74286049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
There is a strong software engineering culture in the R developer community. We recommend creating, updating and vetting packages as well as keeping up with community standards. We invite contributions to the rOpenSci project, where participants can gain experience that will shape their work and that of their peers.
{"title":"The R Developer Community Does Have a Strong Software Engineering Culture","authors":"M. Salmon, Karthik Ram","doi":"10.32614/rj-2021-110","DOIUrl":"https://doi.org/10.32614/rj-2021-110","url":null,"abstract":"There is a strong software engineering culture in the R developer community. We recommend creating, updating and vetting packages as well as keeping up with community standards. We invite contributions to the rOpenSci project, where participants can gain experience that will shape their work and that of their peers.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"2 1","pages":"673"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80457642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe the penPHcure R package, which implements the semi-parametric proportional hazards (PH) cure model of Sy and Taylor (2000) extended to time-varying covariates and the variable selection technique based on its SCAD-penalized likelihood proposed by Beretta and Heuchenne (2019a). In survival analysis, cure models are a useful tool when a fraction of the population is likely to be immune from the event of interest. They can separate the effects of certain factors on the probability of being susceptible and on the time until the occurrence of the event. Moreover, the penPHcure package allows the user to simulate data from a PH cure model, where the event times are generated on a continuous scale from a piecewise exponential distribution conditional on time-varying covariates, with a method similar to Hendry (2014). We present the results of a simulation study to assess the finite sample performance of the methodology and we illustrate the functionalities of the penPHcure package using criminal recidivism data.
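The simulation idea in this abstract, event times drawn from a piecewise exponential distribution for susceptible subjects and no event for cured ones, can be sketched as follows. This is not the penPHcure implementation or the method of Hendry (2014): the function names, the logistic form of the susceptibility probability, the single covariate, and all parameter values are assumptions made for illustration, and censoring is left out.

```python
import math
import random


def piecewise_exp_time(cut_points, hazards, rng):
    """Draw one event time when the hazard is constant on each interval.

    cut_points: increasing interval starts, beginning at 0.0, e.g. [0.0, 1.0, 2.0].
    hazards:    one strictly positive hazard per interval; the last interval
                is treated as open-ended.
    The draw inverts the cumulative hazard H(t) at a unit-exponential value.
    """
    target = rng.expovariate(1.0)  # H(T) follows Exp(1) for a continuous T
    cum = 0.0
    for j, rate in enumerate(hazards):
        start = cut_points[j]
        end = cut_points[j + 1] if j + 1 < len(cut_points) else math.inf
        width = end - start
        if cum + rate * width >= target:
            return start + (target - cum) / rate
        cum += rate * width
    raise AssertionError("unreachable when the last hazard is positive")


def simulate_subject(x_by_interval, cut_points, beta, gamma0, gamma1,
                     base_hazard, rng):
    """Return an event time, or math.inf for a cured (immune) subject.

    Assumption: susceptibility follows a logistic model in the first
    covariate value; the hazard in interval j is base_hazard * exp(beta * x_j).
    """
    p_susceptible = 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * x_by_interval[0])))
    if rng.random() >= p_susceptible:
        return math.inf  # cured: the event never occurs
    hazards = [base_hazard * math.exp(beta * x) for x in x_by_interval]
    return piecewise_exp_time(cut_points, hazards, rng)


rng = random.Random(2021)
times = [simulate_subject([0.2, 0.4, 0.1], [0.0, 1.0, 2.0],
                          beta=0.5, gamma0=0.0, gamma1=1.0,
                          base_hazard=0.3, rng=rng)
         for _ in range(5)]
print(times)
```

Because the covariate is constant within each interval, the hazard is piecewise constant and the cumulative hazard can be inverted interval by interval, which is what keeps the event times on a continuous scale.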
{"title":"penPHcure: Variable Selection in Proportional Hazards Cure Model with Time-Varying Covariates","authors":"Alessandro Beretta, C. Heuchenne","doi":"10.32614/rj-2021-061","DOIUrl":"https://doi.org/10.32614/rj-2021-061","url":null,"abstract":"We describe the penPHcure R package, which implements the semi-parametric proportionalhazards (PH) cure model of Sy and Taylor (2000) extended to time-varying covariates and the variable selection technique based on its SCAD-penalized likelihood proposed by Beretta and Heuchenne (2019a). In survival analysis, cure models are a useful tool when a fraction of the population is likely to be immune from the event of interest. They can separate the effects of certain factors on the probability to be susceptible and on the time until the occurrence of the event. Moreover, the penPHcure package allows the user to simulate data from a PH cure model, where the event-times are generated on a continuous scale from a piecewise exponential distribution conditional on time-varying covariates, with a method similar to Hendry (2014). We present the results of a simulation study to assess the finite sample performance of the methodology and we illustrate the functionalities of the penPHcure package using criminal recidivism data.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"26 1","pages":"116"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77821085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although R programming has been a part of research since its origins in the 1990s, few studies address scientific software development from a Software Engineering (SE) perspective. The past few years have seen unparalleled growth in the R community, and it is time to push the boundaries of SE research and R programming forward. This paper discusses relevant studies that close this gap. Additionally, it proposes a set of good practices derived from those findings, aiming to act as a call-to-arms for both the R and RSE (Research SE) communities to explore specific, interdisciplinary paths of research.
{"title":"Software Engineering and R Programming: A Call for Research","authors":"M. Vidoni","doi":"10.32614/rj-2021-108","DOIUrl":"https://doi.org/10.32614/rj-2021-108","url":null,"abstract":"Although R programming has been a part of research since its origins in the 1990s, few studies address scientific software development from a Software Engineering (SE) perspective. The past few years have seen unparalleled growth in the R community, and it is time to push the boundaries of SE research and R programming forwards. This paper discusses relevant studies that close this gap Additionally, it proposes a set of good practices derived from those findings aiming to act as a call-to-arms for both the R and RSE (Research SE) community to explore specific, interdisciplinary paths of research.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"25 1","pages":"600"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81047549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In ophthalmology, the early detection of keratoconus is still a crucial problem. Placido disk corneal topographers are an essential tool in clinical practice, and many indices for diagnosing corneal irregularities exist. The main goal of this work is to present the R package rPACI, providing several functions to handle and analyze corneal data. This package implements primary indices of corneal irregularity (based on geometrical properties) and compound indices built from the primary ones, either using a generalized linear model, or as a Bayesian classifier using a hybrid Bayesian network and performing approximate inference. rPACI aims to make the analysis of corneal data accessible for practitioners and researchers in the field. Moreover, a shiny app was developed so that rPACI can be used in any web browser, in a truly user-friendly graphical interface, without installing R or writing any R code. It is openly deployed at https://admaldonado.shinyapps.io/rPACI/ .
{"title":"Analysis of Corneal Data in R with the rPACI Package","authors":"D. Ramos-López, A. D. Maldonado","doi":"10.32614/rj-2021-099","DOIUrl":"https://doi.org/10.32614/rj-2021-099","url":null,"abstract":"In ophthalmology, the early detection of keratoconus is still a crucial problem. Placido disk corneal topographers are an essential tool in clinical practice, and many indices for diagnosing corneal irregularities exist. The main goal of this work is to present the R package rPACI , providing several functions to handle and analyze corneal data. This package implements primary indices of corneal irregularity (based on geometrical properties) and compound indices built from the primary ones, either using a generalized linear model, or as a Bayesian classifier using a hybrid Bayesian network and performing approximate inference. rPACI aims to make the analysis of corneal data accessible for practitioners and researchers in the field. Moreover, a shiny app was developed so that rPACI can be used in any web browser, in a truly user-friendly graphical interface, without installing R or writing any R code. It is openly deployed at https://admaldonado.shinyapps.io/rPACI/ .","PeriodicalId":20974,"journal":{"name":"R J.","volume":"30 1","pages":"253"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90484390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The exPrior package implements a procedure for formulating informative priors of geostatistical properties for a target field site, called ex-situ priors and introduced in ?. The procedure uses a Bayesian hierarchical model to assimilate multiple types of data coming from multiple sites considered similar to the target site. This prior summarizes the information contained in the data in the form of a probability density function that can be used to better inform further geostatistical investigations at the site. The formulation of the prior uses ex-situ data, which can either be gathered by the user or come in the form of a structured database. The package is designed to be flexible in that regard. For illustration purposes and for ease of use, the package is ready to be used with the worldwide hydrogeological parameter database (WWHYPDA) ?.
{"title":"exPrior: An R Package for the Formulation of Ex-Situ Priors","authors":"F. Heße, K. Cucchi, Nura Kawa, Y. Rubin","doi":"10.32614/rj-2021-031","DOIUrl":"https://doi.org/10.32614/rj-2021-031","url":null,"abstract":"The exPrior package implements a procedure for formulating informative priors of geostatistical properties for a target field site, called ex-situ priors and introduced in ?. The procedure uses a Bayesian hierarchical model to assimilate multiple types of data coming from multiple sites considered as similar to the target site. This prior summarizes the information contained in the data in the form of a probability density function that can be used to better inform further geostatistical investigations at the site. The formulation of the prior uses ex-situ data; where the data set can either be gathered by the user or come in the form of a structured database. The package is designed to be flexible to that regard. For illustration purposes and for easiness of use, the package is ready to be used with the worldwide hydrogeological parameter database (WWHYPDA) ?.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"57 1","pages":"101"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89380241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eigenvector-based spatial filtering constitutes a highly flexible semiparametric approach to account for spatial autocorrelation in a regression framework. It combines judiciously selected eigenvectors from a transformed connectivity matrix to construct a synthetic spatial filter and remove spatial patterns from model residuals. This article introduces the spfilteR package that provides several useful and flexible tools to estimate spatially filtered linear and generalized linear models in R. While the package features functions to identify relevant eigenvectors based on different selection criteria in an unsupervised fashion, it also helps users to perform supervised spatial filtering and to select eigenvectors based on alternative user-defined criteria. After a brief discussion of the eigenvector-based spatial filtering approach, this article presents the main functions of the package and illustrates their usage. A comparison to alternative implementations in other R packages highlights the added value of the spfilteR package.
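The spatial autocorrelation that such a filter is meant to remove from residuals is commonly quantified with Moran's I. The abstract does not name this statistic, so its use here is an assumption, and the sketch below is not part of the spfilteR API: the function name `morans_i`, the four-unit chain of neighbours, and the residual values are all hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weight matrix `weights`.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i^2,
    where z are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Four spatial units in a chain: each unit neighbours the next one.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

smooth = [1.0, 2.0, 3.0, 4.0]          # neighbours resemble each other
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbours differ as much as possible

print(morans_i(smooth, W))       # positive autocorrelation
print(morans_i(alternating, W))  # negative autocorrelation
```

A residual vector with a clearly positive value, like `smooth` here, is the situation in which adding eigenvectors of the transformed connectivity matrix as regressors is intended to help.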
{"title":"spfilteR: An R package for Semiparametric Spatial Filtering with Eigenvectors in (Generalized) Linear Models","authors":"Sebastian Juhl","doi":"10.32614/rj-2021-085","DOIUrl":"https://doi.org/10.32614/rj-2021-085","url":null,"abstract":"Eigenvector-based spatial filtering constitutes a highly flexible semiparametric approach to account for spatial autocorrelation in a regression framework. It combines judiciously selected eigenvectors from a transformed connectivity matrix to construct a synthetic spatial filter and remove spatial patterns from model residuals. This article introduces the spfilteR package that provides several useful and flexible tools to estimate spatially filtered linear and generalized linear models in R. While the package features functions to identify relevant eigenvectors based on different selection criteria in an unsupervised fashion, it also helps users to perform supervised spatial filtering and to select eigenvectors based on alternative user-defined criteria. After a brief discussion of the eigenvectorbased spatial filtering approach, this article presents the main functions of the package and illustrates their usage. A comparison to alternative implementations in other R packages highlights the added value of the spfilteR package.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"95 1","pages":"380"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79127336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article illustrates the use of the bcmixed package and focuses on the two main functions: bcmarg and bcmmrm. The bcmarg function provides inference results for a marginal model of a mixed effect model using the Box–Cox transformation. The bcmmrm function provides model median inferences based on the mixed effect models for repeated measures analysis using the Box–Cox transformation for longitudinal randomized clinical trials. Using the bcmmrm function, analysis results with high power and high interpretability for treatment effects can be obtained for longitudinal randomized clinical trials with skewed outcomes. Further, the bcmixed package provides summarizing and visualization tools, which are helpful for writing clinical trial reports.
Introduction
Longitudinal data are often observed in medical or biological research. One of the most popular statistical models for studying longitudinal continuous outcomes is the linear mixed effect model. Several packages are available from CRAN that allow for the implementation of linear mixed effects models (e.g., nlme (Pinheiro et al., 2019), glme (Sam Weerahandi et al., 2021), lme4 (Bates et al., 2015), CLME (Jelsema and Peddada, 2016), PLmixed (Rockwood and Jeon, 2019), MCMCglmm (Hadfield, 2010)). The linear mixed effect models assume that longitudinal outcomes follow a multivariate normal distribution. However, the distribution of the outcome is often right skewed in the medical and biological fields. Therefore, evaluating fixed effects based on the normal distribution theory may result in inefficient inferences such as power loss for some statistical tests. In addition, although a model-based mean for a certain level of the categorical explanatory variables is often estimated when applying the linear mixed effect model (e.g., the model-based mean for each treatment group at the last visit in a randomized clinical trial), the mean may be inadequate as a representative value for the skewed data.
The Box–Cox transformation (Box and Cox, 1964) is often applied to skewed longitudinal data (Lipsitz et al., 2000) to reduce the skewness of the outcome so that existing statistical models based on a normal distribution can be applied. However, it is difficult to directly interpret the model mean estimated on the transformed scale. For the sake of the interpretability of the analysis results, Maruo et al. (2015) propose a model median inference method on the original scale based on the Box–Cox transformation in the context of randomized clinical trials. Maruo et al. (2017) extend this method to the framework of mixed effects models for repeated measures (MMRM) analysis (Mallinckrodt et al., 2001) in the context of longitudinal randomized clinical trials. The bcmixed package (Maruo et al., 2020) contains functions to estimate model medians for longitudinal data proposed by Maruo et al. (2017) as well as a sample data set that is used in Maruo et al. (2017). In this package
{"title":"bcmixed: A Package for Median Inference on Longitudinal Data with the Box-Cox Transformation","authors":"K. Maruo, Ryota Ishii, Y. Yamaguchi, M. Gosho","doi":"10.32614/rj-2021-083","DOIUrl":"https://doi.org/10.32614/rj-2021-083","url":null,"abstract":"This article illustrates the use of the bcmixed package and focuses on the two main functions: bcmarg and bcmmrm. The bcmarg function provides inference results for a marginal model of a mixed effect model using the Box–Cox transformation. The bcmmrm function provides model median inferences based on the mixed effect models for repeated measures analysis using the Box–Cox transformation for longitudinal randomized clinical trials. Using the bcmmrm function, analysis results with high power and high interpretability for treatment effects can be obtained for longitudinal randomized clinical trials with skewed outcomes. Further, the bcmixed package provides summarizing and visualization tools, which would be helpful to write clinical trial reports. Introduction Longitudinal data are often observed in medical or biological research. One of the most popular statistical models for studying longitudinal continuous outcomes is the linear mixed effect model. Several packages are available from CRAN that allow for the implementation of linear mixed effects models (e.g., nlme (Pinheiro et al., 2019), glme (Sam Weerahandi et al., 2021), lme4 (Bates et al., 2015), CLME (Jelsema and Peddada, 2016), PLmixed (Rockwood and Jeon, 2019), MCMCglmm (Hadfield, 2010)).The linear mixed effect models assume that longitudinal outcomes follow multivariate normal distribution. However, the distribution of the outcome is often right skewed in the medical and biological fields. Therefore, evaluating fixed effects based on the normal distribution theory may result in inefficient inferences such as power loss for some statistical tests. 
In addition, although a model-based mean for a certain level of the categorical exploratory variables is often estimated when applying the linear mixed effect model (e.g., the model-based mean for each treatment group of a last visit in a randomized clinical trial), the mean may be inadequate as a representative value for the skewed data. The Box–Cox transformation (Box and Cox, 1964) is often applied to skewed longitudinal data (Lipsitz et al., 2000) to reduce the skewness of a skewed outcome and apply existing statistical models based on a normal distribution. However, it is difficult to directly interpret the model mean estimated on the scale after applying some transformations to the outcome variable. For the sake of the interpretability of the analysis results, Maruo et al. (2015) propose a model median inference method on the original scale based on the Box–Cox transformation in the context of randomized clinical trials. Maruo et al. (2017) extend this method to the framework of mixed effects models for repeated measures (MMRM) analysis (Mallinckrodt et al., 2001) in the context of longitudinal randomized clinical trials. The bcmixed package (Maruo et al., 2020) contains functions to estimate model medians for longitudinal data proposed by Maruo et al. (2017) as well as a sample data set that is used in Maruo et al. (2017). In this package","PeriodicalId":20974,"journal":{"name":"R J.","volume":"38 1","pages":"153"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79196592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
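The reason a median survives the Box–Cox transformation while a mean does not can be shown with a small sketch. This is not the bcmarg or bcmmrm implementation: the helper names `box_cox` and `inv_box_cox`, the fixed transformation parameter 0.5, and the five skewed values are hypothetical, and no mixed model is fitted.

```python
import math
import statistics


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_box_cox(z, lam):
    """Map a value on the transformed scale back to the original scale."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("z is outside the range of the transformation")
    return base ** (1.0 / lam)


# Made-up right-skewed outcomes (illustrative only).
y = [1.0, 2.0, 3.0, 5.0, 40.0]
lam = 0.5
z = [box_cox(v, lam) for v in y]

# The transform is monotone, so the median maps back exactly ...
median_back = inv_box_cox(statistics.median(z), lam)
# ... whereas back-transforming the mean does not recover the original mean.
mean_back = inv_box_cox(statistics.mean(z), lam)

print(statistics.median(y), median_back)
print(statistics.mean(y), mean_back)
```

If the transformed outcome is modelled as normal, its mean and median coincide, so back-transforming the model mean gives a median on the original scale; that is the general idea behind model median inference, stated here without the variance and mixed-model details of the cited papers.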
It is a pleasure to take part in such a fruitful discussion about the relationship between Software Engineering and R programming, and what could be gained by allowing each to look more closely at the other. Several discussants make valuable arguments that ought to be further discussed.
{"title":"Rejoinder: Software Engineering and R Programming","authors":"M. Vidoni","doi":"10.32614/rj-2021-112","DOIUrl":"https://doi.org/10.32614/rj-2021-112","url":null,"abstract":"It is a pleasure to take part in such fruitful discussion about the relationship between Software Engineering and R programming, and what could be gain by allowing each to look more closely at the other. Several discussants make valuable arguments that ought to be further discussed.","PeriodicalId":20974,"journal":{"name":"R J.","volume":"6 1","pages":"713"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84048677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}