{"title":"etrm: Energy Trading and Risk Management in R","authors":"Anders D. Sleire","doi":"10.32614/rj-2022-013","DOIUrl":"https://doi.org/10.32614/rj-2022-013","url":null,"abstract":"","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"14 1","pages":"320-341"},"PeriodicalIF":2.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69958723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariate Subgaussian Stable Distributions in R","authors":"B. Swihart, J. P. Nolan","doi":"10.32614/rj-2022-056","DOIUrl":"https://doi.org/10.32614/rj-2022-056","url":null,"abstract":"","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"14 1","pages":"293-302"},"PeriodicalIF":2.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69959039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-12-01; Epub Date: 2021-08-17; DOI: 10.32614/rj-2021-072
Andrew G Allmon, J S Marron, Michael G Hudgens
High-dimensional low sample size (HDLSS) data sets frequently emerge in many biomedical applications. The direction-projection-permutation (DiProPerm) test is a two-sample hypothesis test for comparing two high-dimensional distributions. The DiProPerm test is exact, i.e., the type I error is guaranteed to be controlled at the nominal level for any sample size, and thus is applicable in the HDLSS setting. This paper discusses the key components of the DiProPerm test, introduces the diproperm R package, and demonstrates the package on a real-world data set.
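The three steps named in the test (direction, projection, permutation) can be sketched as follows. This is a minimal illustration in plain Python, not the diproperm package's implementation: the mean-difference direction, the difference of projected means as the univariate statistic, and the names `md_statistic` and `diproperm_sketch` are all assumptions chosen for brevity.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def md_statistic(X, labels):
    """Direction + projection steps (assumed choices, see lead-in).

    Direction: unit vector along the difference of the two group means.
    Projection: project every sample onto that direction and return the
    difference between the projected group means.
    """
    group1 = [x for x, g in zip(X, labels) if g == 1]
    group0 = [x for x, g in zip(X, labels) if g == 0]
    diff = [a - b for a, b in zip(mean_vector(group1), mean_vector(group0))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return 0.0
    direction = [d / norm for d in diff]
    scores = [sum(xi * di for xi, di in zip(x, direction)) for x in X]
    s1 = [s for s, g in zip(scores, labels) if g == 1]
    s0 = [s for s, g in zip(scores, labels) if g == 0]
    return sum(s1) / len(s1) - sum(s0) / len(s0)


def diproperm_sketch(X, labels, n_perm=200, seed=0):
    """Permutation step: re-estimate direction and statistic under shuffled labels."""
    rng = random.Random(seed)
    observed = md_statistic(X, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps group sizes fixed
        if md_statistic(X, shuffled) >= observed:
            exceed += 1
    # Counting the observed labeling itself keeps the p-value strictly positive.
    p_value = (1 + exceed) / (n_perm + 1)
    return observed, p_value
```

Note that the direction is re-estimated for every permuted labeling; reusing the direction fitted on the original labels would make the observed statistic look more extreme than it is.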
{"title":"diproperm: An R Package for the DiProPerm Test.","authors":"Andrew G Allmon, J S Marron, Michael G Hudgens","doi":"10.32614/rj-2021-072","DOIUrl":"10.32614/rj-2021-072","url":null,"abstract":"<p><p>High-dimensional low sample size (HDLSS) data sets frequently emerge in many biomedical applications. The direction-projection-permutation (DiProPerm) test is a two-sample hypothesis test for comparing two high-dimensional distributions. The DiProPerm test is exact, i.e., the type I error is guaranteed to be controlled at the nominal level for any sample size, and thus is applicable in the HDLSS setting. This paper discusses the key components of the DiProPerm test, introduces the diproperm R package, and demonstrates the package on a real-world data set.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"13 2","pages":"266-272"},"PeriodicalIF":2.3,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9202909/pdf/nihms-1809552.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40026217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The R Journal: Changes in R 4.0–4.1","authors":"T. Kalibera, S. Meyer, K. Hornik","doi":"10.32614/CORE","DOIUrl":"https://doi.org/10.32614/CORE","url":null,"abstract":"","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"139 1","pages":"631-633"},"PeriodicalIF":2.1,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86262117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-06-01; Epub Date: 2021-06-07; DOI: 10.32614/RJ-2021-033
Eashwar V Somasundaram, Shael E Brown, Adam Litzler, Jacob G Scott, Raoul R Wadhwa
Several persistent homology software libraries have been implemented in R. Specifically, the Dionysus, GUDHI, and Ripser libraries have been wrapped by the TDA and TDAstats CRAN packages. These libraries are powerful analysis tools that are computationally expensive and, to our knowledge, have not been formally benchmarked. Here, we analyze runtime and memory growth for the two R packages and the three underlying libraries. We find that persistent homology of datasets with fewer than 3 dimensions is computed fastest by the GUDHI library in the TDA package. For higher-dimensional datasets, the Ripser library in the TDAstats package is the fastest. Ripser and TDAstats are also the most memory-efficient tools for calculating persistent homology.
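To make concrete what these libraries compute, here is a minimal sketch of the simplest case: 0-dimensional persistent homology (connected components) of a Vietoris-Rips filtration, computed with a union-find structure. It is an illustration only and does not reflect how Dionysus, GUDHI, or Ripser are implemented; the function name `h0_persistence` and the convention that an edge enters the filtration at the full pairwise distance (some tools use half of it) are assumptions.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Every point is born as its own component at scale 0. Edges are added in
    order of increasing length; each edge that joins two different components
    kills one of them. One component never dies and is not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges: this O(n^2) list is the simplest instance of the
    # memory growth that makes persistent homology expensive.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths
```

For n points the function returns n - 1 finite death times. Higher-dimensional features (loops, voids) require triangles and larger simplices, whose number grows much faster than the number of edges.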
{"title":"Benchmarking R packages for Calculation of Persistent Homology.","authors":"Eashwar V Somasundaram, Shael E Brown, Adam Litzler, Jacob G Scott, Raoul R Wadhwa","doi":"10.32614/RJ-2021-033","DOIUrl":"https://doi.org/10.32614/RJ-2021-033","url":null,"abstract":"<p><p>Several persistent homology software libraries have been implemented in R. Specifically, the Dionysus, GUDHI, and Ripser libraries have been wrapped by the <b>TDA</b> and <b>TDAstats</b> CRAN packages. These software represent powerful analysis tools that are computationally expensive and, to our knowledge, have not been formally benchmarked. Here, we analyze runtime and memory growth for the 2 R packages and the 3 underlying libraries. We find that datasets with less than 3 dimensions can be evaluated with persistent homology fastest by the GUDHI library in the <b>TDA</b> package. For higher-dimensional datasets, the Ripser library in the TDAstats package is the fastest. Ripser and <b>TDAstats</b> are also the most memory-efficient tools to calculate persistent homology.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"13 1","pages":"184-193"},"PeriodicalIF":2.1,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434812/pdf/nihms-1733366.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39409270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Çavuş, Olgun Aydın, Ozan Evkaya, Ozancan Özdemir, Deniz Bezer, Ugur Dar
Why R? Turkey 2021 was a three-day online conference held on April 16-18, 2021 to bring together researchers and professionals from Turkey. We aimed to promote the R community in Turkey by bringing together R users from different backgrounds, such as genetics, sociology, finance, economics, and biostatistics. There were 8 thematic sessions and 18 invited speakers. This article describes the preparation phase, the technical details, and the impact of the conference on its audience.
{"title":"Conference Report of Why R? Turkey 2021","authors":"M. Çavuş, Olgun Aydın, Ozan Evkaya, Ozancan Özdemir, Deniz Bezer, Ugur Dar","doi":"10.32614/RJ-2021","DOIUrl":"https://doi.org/10.32614/RJ-2021","url":null,"abstract":"The Why R? Turkey 2021 as a three-day online conference was organized to bring together researchers and professionals from Turkey on April 16-17-18, 2021. We hereby aimed to promote the R community in Turkey by bringing R users with different backgrounds such as genetics, sociology, finance, economy, bio-statistics. There were 8 thematic sessions and 18 invited speakers. In this article, it is aimed to describe the preparation phase, technical details, and the impact of the conference on audience.","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"32 1","pages":"648-652"},"PeriodicalIF":2.1,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72897054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emily Morris, Kevin He, Yanming Li, Yi Li, Jian Kang
High-dimensional variable selection in the proportional hazards (PH) model has many successful applications in different areas. In practice, data may involve confounding variables that do not satisfy the PH assumption, in which case the stratified proportional hazards (SPH) model can be adopted to control the confounding effects by stratification without directly modeling them. However, there is a lack of computationally efficient statistical software for high-dimensional variable selection in the SPH model. In this work, an R package, SurvBoost, is developed to implement the gradient boosting algorithm for fitting the SPH model with high-dimensional covariates. Simulation studies demonstrate that in many scenarios SurvBoost can achieve better selection accuracy and substantially reduce computational time compared to the existing R package that implements boosting algorithms without stratification. The proposed R package is also illustrated by an analysis of gene expression data with survival outcome in The Cancer Genome Atlas study. In addition, a detailed hands-on tutorial for SurvBoost is provided.
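The idea of controlling for a confounder by stratification, without modeling it, shows up directly in the partial likelihood: each event is compared only with subjects at risk in the same stratum. The sketch below evaluates the stratified partial log-likelihood that a boosting procedure would aim to increase. It is not SurvBoost's code; the function name, the argument layout, and the handling of tied times (each event's risk set includes everyone in its stratum with time at least as large) are assumptions.

```python
import math
from collections import defaultdict


def stratified_partial_loglik(times, events, strata, X, beta):
    """Stratified Cox partial log-likelihood.

    times  : observed times
    events : 1 if the event was observed, 0 if censored
    strata : stratum label for each subject
    X      : covariate rows, one per subject
    beta   : coefficient vector shared across strata
    """
    # Linear predictor x_i' beta for every subject.
    eta = [sum(x * b for x, b in zip(row, beta)) for row in X]

    # Group subject indices by stratum.
    by_stratum = defaultdict(list)
    for i, s in enumerate(strata):
        by_stratum[s].append(i)

    loglik = 0.0
    for members in by_stratum.values():
        for i in members:
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            # Risk set is restricted to the subject's own stratum.
            risk = sum(math.exp(eta[j]) for j in members if times[j] >= times[i])
            loglik += eta[i] - math.log(risk)
    return loglik
```

Because the baseline hazard cancels within each stratum, it may differ arbitrarily between strata while the coefficients `beta` stay common to all of them.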
{"title":"SurvBoost: An R Package for High-Dimensional Variable Selection in the Stratified Proportional Hazards Model via Gradient Boosting.","authors":"Emily Morris, Kevin He, Yanming Li, Yi Li, Jian Kang","doi":"10.32614/rj-2020-018","DOIUrl":"https://doi.org/10.32614/rj-2020-018","url":null,"abstract":"<p><p>High-dimensional variable selection in the proportional hazards (PH) model has many successful applications in different areas. In practice, data may involve confounding variables that do not satisfy the PH assumption, in which case the stratified proportional hazards (SPH) model can be adopted to control the confounding effects by stratification without directly modeling the confounding effects. However, there is a lack of computationally efficient statistical software for high-dimensional variable selection in the SPH model. In this work an R package, <b>SurvBoost</b>, is developed to implement the gradient boosting algorithm for fitting the SPH model with high-dimensional covariate variables. Simulation studies demonstrate that in many scenarios <b>SurvBoost</b> can achieve better selection accuracy and reduce computational time substantially compared to the existing R package that implements boosting algorithms without stratification. The proposed R package is also illustrated by an analysis of gene expression data with survival outcome in The Cancer Genome Atlas study. In addition, a detailed hands-on tutorial for <b>SurvBoost</b> is provided.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"12 1","pages":"105-117"},"PeriodicalIF":2.1,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8174798/pdf/nihms-1656432.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39084202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chi Hyun Lee, Heng Zhou, Jing Ning, Diane D Liu, Yu Shen
Data subject to length-biased sampling are frequently encountered in various applications, including prevalent cohort studies, and are considered a special case of left-truncated data under the stationarity assumption. Many semiparametric regression methods have been proposed for length-biased data to model the association between covariates and the survival outcome of interest. In this paper, we present a brief review of the statistical methodologies established for the analysis of length-biased data under the Cox model, which is the most commonly adopted semiparametric model, and introduce an R package, CoxPhLb, that implements these methods. Specifically, the package includes features such as fitting the Cox model to explore covariate effects on survival times and checking the proportional hazards model assumptions and the stationarity assumption. We illustrate usage of the package with a simulated data example and a real dataset, the Channing House data, which are publicly available.
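A short simulation shows why length-biased data need special methods. Under length-biased sampling, a duration is observed with probability proportional to its length, so long durations are over-represented and naive summaries are biased upward. The sketch below is an illustration only and is unrelated to the CoxPhLb code; the exponential population, the sample sizes, and the helper name `length_biased_sample` are assumptions.

```python
import random


def length_biased_sample(durations, k, rng):
    """Draw k durations with probability proportional to their length."""
    return rng.choices(durations, weights=durations, k=k)


rng = random.Random(42)

# Population of durations with mean 1 (exponential with rate 1).
population = [rng.expovariate(1.0) for _ in range(20000)]

# What a prevalent cohort would tend to observe from that population.
biased = length_biased_sample(population, 20000, rng)

pop_mean = sum(population) / len(population)
biased_mean = sum(biased) / len(biased)

print(f"population mean:    {pop_mean:.3f}")
print(f"length-biased mean: {biased_mean:.3f}")
```

For this population the length-biased mean is close to E[T^2]/E[T] = 2, twice the true mean of 1, which is the kind of distortion that estimators for length-biased data are designed to correct.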
{"title":"CoxPhLb: An R Package for Analyzing Length Biased Data under Cox Model.","authors":"Chi Hyun Lee, Heng Zhou, Jing Ning, Diane D Liu, Yu Shen","doi":"10.32614/rj-2020-024","DOIUrl":"https://doi.org/10.32614/rj-2020-024","url":null,"abstract":"<p><p>Data subject to length-biased sampling are frequently encountered in various applications including prevalent cohort studies and are considered as a special case of left-truncated data under the stationarity assumption. Many semiparametric regression methods have been proposed for length-biased data to model the association between covariates and the survival outcome of interest. In this paper, we present a brief review of the statistical methodologies established for the analysis of length-biased data under the Cox model, which is the most commonly adopted semiparametric model, and introduce an R package <b>CoxPhLb</b> that implements these methods. Specifically, the package includes features such as fitting the Cox model to explore covariate effects on survival times and checking the proportional hazards model assumptions and the stationarity assumption. We illustrate usage of the package with a simulated data example and a real dataset, the Channing House data, which are publicly available.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"12 1","pages":"118-130"},"PeriodicalIF":2.1,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7595345/pdf/nihms-1638580.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38657972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01; Epub Date: 2019-08-20; DOI: 10.32614/rj-2019-038
Danilo Alvares, Sebastien Haneuse, Catherine Lee, Kyu Ha Lee
Semi-competing risks refer to the setting where primary scientific interest lies in estimation and inference with respect to a non-terminal event, the occurrence of which is subject to a terminal event. In this paper, we present the R package SemiCompRisks, which provides functions to perform the analysis of independent/clustered semi-competing risks data under the illness-death multi-state model. The package gives users substantial flexibility in specifying the model components, with options including: accelerated failure time or proportional hazards regression models; parametric or non-parametric specifications for baseline survival functions; parametric or non-parametric specifications for random effects distributions when the data are cluster-correlated; and a Markov or semi-Markov specification for the terminal event following the non-terminal event. While estimation is mainly performed within the Bayesian paradigm, the package also provides maximum likelihood estimation for select parametric models. The package also includes functions for univariate survival analysis as complementary analysis tools.
{"title":"SemiCompRisks: An R Package for the Analysis of Independent and Cluster-correlated Semi-competing Risks Data.","authors":"Danilo Alvares, Sebastien Haneuse, Catherine Lee, Kyu Ha Lee","doi":"10.32614/rj-2019-038","DOIUrl":"https://doi.org/10.32614/rj-2019-038","url":null,"abstract":"<p><p>Semi-competing risks refer to the setting where primary scientific interest lies in estimation and inference with respect to a non-terminal event, the occurrence of which is subject to a terminal event. In this paper, we present the R package <b>SemiCompRisks</b> that provides functions to perform the analysis of independent/clustered semi-competing risks data under the illness-death multi-state model. The package allows the user to choose the specification for model components from a range of options giving users substantial flexibility, including: accelerated failure time or proportional hazards regression models; parametric or non-parametric specifications for baseline survival functions; parametric or non-parametric specifications for random effects distributions when the data are cluster-correlated; and, a Markov or semi-Markov specification for terminal event following non-terminal event. While estimation is mainly performed within the Bayesian paradigm, the package also provides the maximum likelihood estimation for select parametric models. The package also includes functions for univariate survival analysis as complementary analysis tools.</p>","PeriodicalId":51285,"journal":{"name":"R Journal","volume":"11 1","pages":"376-400"},"PeriodicalIF":2.1,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7889044/pdf/nihms-1668679.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25382986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}