The smacof package offers a comprehensive implementation of multidimensional scaling (MDS) techniques in R. Since its first publication (De Leeuw and Mair 2009b) the functionality of the package has been enhanced, and several additional methods, features and utilities were added. Major updates include a complete re-implementation of multidimensional unfolding allowing for monotone dissimilarity transformations, including row-conditional, circular, and external unfolding. Additionally, the constrained MDS implementation was extended in terms of optimal scaling of the external variables. Further package additions include various tools and functions for goodness-of-fit assessment, unidimensional scaling, gravity MDS, asymmetric MDS, Procrustes, and MDS biplots. All these new package functionalities are illustrated using a variety of real-life applications.
P. Mair, P. Groenen, J. Leeuw. "More on Multidimensional Scaling and Unfolding in R: smacof Version 2." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.i10
Due to tradition and ease of estimation, the vast majority of clinical and epidemiological papers with time-to-event data report hazard ratios from Cox proportional hazards regression models. Although hazard ratios are well known, they can be difficult to interpret, particularly as causal contrasts, in many settings. Nonparametric or fully parametric estimators allow for the direct estimation of more easily causally interpretable estimands such as the cumulative incidence and restricted mean survival. However, modeling these quantities as functions of covariates is limited to a few categorical covariates with nonparametric estimators, and often requires simulation or numeric integration with parametric estimators. Combining pseudo-observations based on nonparametric estimands with parametric regression on the pseudo-observations allows for the best of these two approaches and has many nice properties. In this paper, we develop a user-friendly, easy-to-understand way of doing event history regression for the cumulative incidence and the restricted mean survival, using the pseudo-observation framework for estimation. The interface uses the well-known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and correct variance estimation.
M. Sachs, E. Gabriel. "Event History Regression with Pseudo-Observations: Computational Approaches and an Implementation in R." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.i09
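The pseudo-observation idea in the abstract above can be illustrated with a small jackknife computation. The following is a minimal sketch in Python (not R, and not the paper's implementation), assuming a single event type so that the cumulative incidence at time t is one minus the Kaplan-Meier survival estimate. The function names `km_survival` and `pseudo_observations` are hypothetical, and the regression step on the pseudo-observations is not shown.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] == 1 means the event was observed."""
    surv = 1.0
    # Loop over the distinct observed event times up to and including t.
    event_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        n_events = sum(1 for ti, ei in zip(times, events) if ti == u and ei == 1)
        surv *= 1.0 - n_events / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for the cumulative incidence 1 - S(t).

    For subject i: n * theta_hat - (n - 1) * theta_hat_without_i.
    Assumes a single event type (no competing risks).
    """
    times, events = list(times), list(events)
    n = len(times)
    full = 1.0 - km_survival(times, events, t)
    pseudo = []
    for i in range(n):
        # Leave subject i out and re-estimate.
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        loo = 1.0 - km_survival(loo_times, loo_events, t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo


if __name__ == "__main__":
    # Toy data with one censored subject (event indicator 0).
    print(pseudo_observations([1, 2, 3, 4, 5], [1, 0, 1, 1, 1], 3.5))
```

Without censoring, each pseudo-observation reduces to the indicator that the subject's event occurred by time t, which is one way to sanity-check such a routine. The pseudo-observations would then serve as the response in a generalized linear model.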
D. Dunnington, Nell Libera, J. Kurek, I. Spooner, G. Gagnon
This paper presents the tidypaleo package for R, which enables high-quality reproducible visualizations of time-stratigraphic multivariate data that is common to several disciplines of the natural sciences. Rather than introduce new plotting functions, the tidypaleo package defines several orthogonal components of the ggplot2 package that, when combined, enable most types of stratigraphic diagrams to be created. We do so by conceptualizing multi-parameter data as a series of measurements (rows) with attributes (columns), enabling the use of the ggplot2 facet mechanism to display multi-parameter data. The orthogonal components include (1) scales that represent relative abundance and concentration values, (2) geometries that are commonly used in paleoenvironmental diagrams created elsewhere, (3) facets that correctly assign scales and sizes to panels representing multiple parameters, and (4) theme elements that enable tidypaleo to create elegant graphics. Collectively, this approach demonstrates the efficacy of a minimal ggplot2 wrapper to create domain-specific plots.
D. Dunnington, Nell Libera, J. Kurek, I. Spooner, G. Gagnon. "tidypaleo: Visualizing Paleoenvironmental Archives Using ggplot2." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v101.i07
M. Battaglini, Valerio Leone Sciabolazza, Eleonora Patacchini, Sida Peng
The R package econet provides methods for estimating parameter-dependent network centrality measures with linear-in-means models. Both nonlinear least squares and maximum likelihood estimators are implemented. The methods allow for both link and node heterogeneity in network effects, endogenous network formation and the presence of unconnected nodes. The routines also compare the explanatory power of parameter-dependent network centrality measures with those of standard measures of network centrality. Benefits and features of the econet package are illustrated using data from Battaglini and Patacchini (2018) and Battaglini, Leone Sciabolazza, and Patacchini (2020).
M. Battaglini, Valerio Leone Sciabolazza, Eleonora Patacchini, Sida Peng. "econet: An R Package for Parameter-Dependent Network Centrality Measures." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.i08
modelsummary is a package to summarize data and statistical models in R. It supports over one hundred types of models out-of-the-box, and allows users to report the results of those models side-by-side in a table, or in coefficient plots. It makes it easy to execute common tasks such as computing robust standard errors, adding significance stars, and manipulating coefficient and model labels. Beyond model summaries, the package also includes a suite of tools to produce highly flexible data summary tables, such as dataset overviews, correlation matrices, (multi-level) cross-tabulations, and balance tables (also known as “Table 1”). The appearance of the tables produced by modelsummary can be customized using external packages such as kableExtra, gt, flextable, or huxtable; the plots can be customized using ggplot2. Tables can be exported to many output formats, including HTML, LaTeX, Text/Markdown, Microsoft Word, PowerPoint, Excel, RTF, PDF, and image files. Tables and plots can be embedded seamlessly in rmarkdown, knitr, or Sweave dynamic documents. The modelsummary package is designed to be simple, robust, modular, and extensible.
Vincent Arel‐Bundock. "modelsummary: Data and Model Summaries in R." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v103.i01
We present the features and implementation of the R package nvmix for the class of normal variance mixtures including Student t and normal distributions. The package provides functionalities for such distributions, notably the evaluation of the distribution and density function as well as likelihood-based parameter estimation. The distributional family is specified through the quantile function of the underlying mixing random variable. The R package nvmix thus allows one to model multivariate distributions well beyond the classical multivariate normal and t case. Additional functionalities include graphical goodness-of-fit assessment, the estimation of the risk measures value-at-risk and expected shortfall for univariate normal variance mixture distributions and functions to work with normal variance mixture copulas, such as sampling and the evaluation of normal variance mixture copulas and their densities. Furthermore, the package nvmix also provides functionalities for the evaluation of the distribution and density function as well as random variate generation for the more general class of grouped normal variance mixtures.
Erik Hintz, M. Hofert, C. Lemieux. "Multivariate Normal Variance Mixtures in R: The R Package nvmix." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.i02
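The nvmix abstract above says the distributional family is specified through the quantile function of the mixing variable. A rough univariate sketch of that idea follows, written in Python rather than R. It relies on the standard stochastic representation X = loc + sqrt(W) * scale * Z with Z standard normal and W >= 0 independent of Z, and draws W by inverse transform from a user-supplied quantile function. The name `sample_nvm_1d` and its parameters are hypothetical and are not part of the nvmix API.

```python
import math
import random


def sample_nvm_1d(n, qmix, loc=0.0, scale=1.0, rng=None):
    """Draw n values from a univariate normal variance mixture.

    qmix is the quantile function of the non-negative mixing variable W.
    Each draw is loc + sqrt(W) * scale * Z with Z ~ N(0, 1) independent of W.
    """
    rng = rng or random.Random()
    draws = []
    for _ in range(n):
        w = qmix(rng.random())      # inverse-transform sample of W
        z = rng.gauss(0.0, 1.0)     # standard normal component
        draws.append(loc + math.sqrt(w) * scale * z)
    return draws


if __name__ == "__main__":
    rng = random.Random(42)
    # A constant mixing variable W = 1 recovers the normal distribution.
    normal_draws = sample_nvm_1d(5, lambda u: 1.0, rng=rng)
    # An exponential mixing variable (quantile -log(1 - u)) gives heavier tails.
    heavy_draws = sample_nvm_1d(5, lambda u: -math.log(1.0 - u), rng=rng)
    print(normal_draws)
    print(heavy_draws)
```

Passing the quantile function as an argument mirrors the design described in the abstract: changing the mixing distribution changes the family without touching the sampler.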
U. Grömping. "Pro Data Visualization Using R and JavaScript: Analyze and Visualize Key Data on the Web." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.b01
Efficient coding and improvements in the execution order of the up-and-down-blocks algorithm for monotone or isotonic regression lead to a significant increase in speed as well as a short and simple O(n) implementation. Algorithms that use monotone regression as a subroutine, e.g., unimodal or bivariate monotone regression, also benefit from the acceleration. A substantive comparison with and characterization of currently available implementations provides an extensive overview of up-and-down-blocks implementations for the pool-adjacent-violators algorithm for simple linear ordered monotone regression.
F. Busing. "Monotone Regression: A Simple and Fast O(n) PAVA Implementation." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v102.c01
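For readers unfamiliar with the pool-adjacent-violators algorithm (PAVA) named in the entry above, here is a minimal sketch in Python. It is a plain stack-based PAVA for a non-decreasing least-squares fit, not the up-and-down-blocks ordering that the paper optimizes. The function name `pava` and the optional weights argument are illustrative assumptions.

```python
def pava(y, w=None):
    """Weighted least-squares non-decreasing fit to y by pooling adjacent violators."""
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand block means back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


if __name__ == "__main__":
    print(pava([1, 3, 2, 4]))  # the violating pair (3, 2) is pooled to its mean
```

Each point is pushed once and every merge removes a block, so the total work is linear in the number of points, which is the complexity the title refers to.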
This paper introduces the usage and performance of the R package tlrmvnmvt, aimed at computing high-dimensional multivariate normal and Student-t probabilities. The package implements the tile-low-rank methods with block reordering and the separation-of-variable methods with univariate reordering. The performance is compared with two other state-of-the-art R packages, namely the mvtnorm and the TruncatedNormal packages. Our package has the best scalability and is likely to be the only option in thousands of dimensions. However, for applications with high accuracy requirements, the TruncatedNormal package is more suitable. As an application example, we show that the excursion sets of a latent Gaussian random field can be computed with the tlrmvnmvt package without any model approximation and hence, the accuracy of the produced excursion sets is improved.
Jian Cao, M. Genton, D. Keyes, G. Turkiyyah. "tlrmvnmvt: Computing High-Dimensional Multivariate Normal and Student-t Probabilities with Low-Rank Methods in R." Journal of Statistical Software, 2022. https://doi.org/10.18637/jss.v101.i04