R J.: Latest Articles

rassta: Raster-Based Spatial Stratification Algorithms
Pub Date : 2021-11-17 DOI: 10.31223/x50s57
B. Fuentes, Minerva J. Dorantes, John R. Tipton
Spatial stratification of landscapes allows for the development of efficient sampling surveys, the inclusion of domain knowledge in data-driven modeling frameworks, and the production of information relating the spatial variability of response phenomena to that of landscape processes. This work presents the rassta package as a collection of algorithms dedicated to the spatial stratification of landscapes, the calculation of landscape correspondence metrics across geographic space, and the application of these metrics for spatial sampling and modeling of environmental phenomena. The theoretical background of rassta is presented through references to several studies which have benefited from landscape stratification routines. The functionality of rassta is presented through code examples which are complemented with the geographic visualization of their outputs.
Citations: 1
did2s: Two-Stage Difference-in-Differences
Pub Date : 2021-09-10 DOI: 10.32614/RJ-2022-048
K. Butts, J. Gardner
Recent work has highlighted the difficulties of estimating difference-in-differences models when treatment begins at different times for different units. This article introduces the R package did2s, which implements the estimator introduced in Gardner (2021). The article provides an approachable review of the underlying econometric theory and introduces the syntax for the function did2s. Further, the package introduces a function, event_study, that provides a common syntax for all the modern event-study estimators, and plot_event_study to plot the results of each estimator.
Citations: 14
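The two-stage idea behind the did2s entry above can be illustrated with a small sketch. It is written in Python with NumPy rather than R, it does not use the did2s package or its syntax, and the function name `two_stage_did` and the toy panel are hypothetical. It assumes the usual two-stage recipe: fit unit and period effects on untreated observations only, subtract them from every outcome, and then regress the adjusted outcome on the treatment dummy. Dropping the intercept in the second stage is a simplifying assumption, and no standard errors are computed.

```python
import numpy as np

def two_stage_did(y, unit, period, treated):
    """Two-stage difference-in-differences on a long-format panel.

    y, unit, period, treated are 1-D arrays of equal length; `treated`
    is 1 where the unit is under treatment in that period, else 0.
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    units, unit_idx = np.unique(np.asarray(unit), return_inverse=True)
    periods, period_idx = np.unique(np.asarray(period), return_inverse=True)

    # Design matrix of unit and period dummies (no treatment term).
    n = y.size
    X = np.zeros((n, units.size + periods.size))
    X[np.arange(n), unit_idx] = 1.0
    X[np.arange(n), units.size + period_idx] = 1.0

    # Stage 1: fit the fixed effects using untreated observations only.
    coef, *_ = np.linalg.lstsq(X[~treated], y[~treated], rcond=None)

    # Stage 2: remove the fitted effects, then regress the adjusted outcome
    # on the treatment dummy without an intercept, which reduces to the
    # mean adjusted outcome among treated observations.
    adjusted = y - X @ coef
    return adjusted[treated].sum() / treated.sum()

# Noiseless toy panel (hypothetical): 4 units, 5 periods, true effect 2.0.
# Units 0 and 1 are never treated; unit 2 starts in period 2, unit 3 in period 3.
start = {0: None, 1: None, 2: 2, 3: 3}
rows = [(i, t) for i in range(4) for t in range(5)]
unit = np.array([i for i, _ in rows])
period = np.array([t for _, t in rows])
treated = np.array([start[i] is not None and t >= start[i] for i, t in rows])
y = 1.5 * unit + 0.7 * period + 2.0 * treated
estimate = two_stage_did(y, unit, period, treated)
```

The sketch relies on every unit and every period appearing at least once among the untreated observations, which is why the toy panel includes never-treated units.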
dbcsp: User-friendly R package for Distance-Based Common Spatial Patterns
Pub Date : 2021-09-02 DOI: 10.32614/rj-2022-044
Itsaso Rodríguez-Moreno, I. Irigoien, B. Sierra, C. Arenas
Common Spatial Patterns (CSP) is a widely used method for analysing electroencephalography (EEG) data in the supervised classification of brain activity. More generally, it can be used to distinguish between multivariate signals recorded over a time span for two different classes. CSP is based on the simultaneous diagonalization of the average covariance matrices of signals from both classes, and it allows the data to be projected into a low-dimensional subspace. Once data are represented in a low-dimensional subspace, a classification step must be carried out. The original CSP method is based on the Euclidean distance between signals; here, we extend it so that it can be applied with any distance appropriate for the data at hand. Both the classical CSP and the new Distance-Based CSP (DB-CSP) are implemented in an R package called dbcsp.
Citations: 0
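The simultaneous diagonalization mentioned in the dbcsp abstract above can be sketched for the classical (Euclidean) case only. The code below is Python with NumPy, not the dbcsp package, and the function name `csp_filters` and the simulated signals are hypothetical. It uses the standard whitening-then-rotation construction and does not cover the distance-based extension.

```python
import numpy as np

def csp_filters(cov_a, cov_b):
    """Return W such that W @ cov_a @ W.T and W @ cov_b @ W.T are both diagonal."""
    # Whiten the composite covariance cov_a + cov_b.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = np.diag(evals ** -0.5) @ evecs.T
    # In the whitened space the two class covariances sum to the identity,
    # so they share eigenvectors.
    _, rotation = np.linalg.eigh(whitener @ cov_a @ whitener.T)
    return rotation.T @ whitener

# Simulated 4-channel signals for two classes with different channel variances.
rng = np.random.default_rng(0)
signals_a = rng.normal(size=(4, 200)) * np.array([[3.0], [1.0], [1.0], [0.5]])
signals_b = rng.normal(size=(4, 200)) * np.array([[0.5], [1.0], [1.0], [3.0]])
cov_a, cov_b = np.cov(signals_a), np.cov(signals_b)

W = csp_filters(cov_a, cov_b)
d_a = W @ cov_a @ W.T  # diagonal
d_b = W @ cov_b @ W.T  # diagonal, and d_a + d_b is the identity
```

Rows of `W` whose diagonal entry in `d_a` is close to 1 or close to 0 give projections with high variance for one class and low variance for the other, which is what makes them useful before a classification step.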
Generalized Linear Randomized Response Modeling using GLMMRR
Pub Date : 2021-06-18 DOI: 10.32614/rj-2021-104
J. Fox, K. Klotzke, D. Veen
Randomized response (RR) designs are used to collect response data about sensitive behaviors (e.g., criminal behavior, sexual desires). The modeling of RR data is more complex since it requires a description of the RR process. For the class of generalized linear mixed models (GLMMs), the RR process can be represented by an adjusted link function, which relates the expected RR to the linear predictor for most common RR designs. The package GLMMRR includes modified link functions for four different cumulative distributions (i.e., logistic, cumulative normal, Gumbel, Cauchy) for GLMs and GLMMs, where the package lme4 facilitates ML and REML estimation. The mixed modeling framework in GLMMRR can be used to jointly analyze data collected under different designs (e.g., dual questioning, multilevel, mixed mode, repeated measurements designs, multiple-group designs). Model-fit tests, tools for residual analyses, and plot functions that support in-depth RR data analysis are added to the well-known features of the GLM and GLMM software (package lme4). Data of Höglinger and Jann (2018) and Höglinger, Jann, and Diekmann (2014) are used to illustrate the methodology and software.
Citations: 0
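To make the notion of an adjusted link in the GLMMRR abstract above more concrete, here is a minimal Python sketch. It assumes the expected randomized response takes the form c + d * p, where p is the true probability under a logistic model and c and d are constants fixed by the design. The function names, the forced-response design, and its probabilities are hypothetical, and nothing here calls GLMMRR or lme4.

```python
import math

def rr_probability(eta, c, d):
    """Expected randomized response for linear predictor eta (logistic model)."""
    true_prob = 1.0 / (1.0 + math.exp(-eta))
    return c + d * true_prob

def rr_linear_predictor(mu, c, d):
    """Adjusted link: map an expected randomized response mu back to eta."""
    true_prob = (mu - c) / d
    if not 0.0 < true_prob < 1.0:
        raise ValueError("mu is outside the range implied by the design")
    return math.log(true_prob / (1.0 - true_prob))

# Hypothetical forced-response design: answer truthfully with probability
# 0.75, otherwise give a forced "yes" with probability 0.5.
p_truth, p_forced_yes = 0.75, 0.5
c = (1.0 - p_truth) * p_forced_yes
d = p_truth

mu = rr_probability(0.4, c, d)
eta_back = rr_linear_predictor(mu, c, d)  # recovers 0.4 up to rounding
```

Under this design the observable "yes" probability is confined to the interval (c, c + d), which is why the inverse mapping rejects values outside it.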
RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests
Pub Date : 2021-06-15 DOI: 10.32614/rj-2022-012
Cansu Alakus, Denis Larocque, A. Labbe
Like many predictive models, random forests provide point predictions for new observations. Besides the point prediction, it is important to quantify the uncertainty in the prediction. Prediction intervals provide information about the reliability of the point predictions. We have developed a comprehensive R package, RFpredInterval, that integrates 16 methods to build prediction intervals with random forests and boosted forests. The set of methods implemented in the package includes a new method to build prediction intervals with boosted forests (PIBF) and 15 method variations to produce prediction intervals with random forests, as proposed by Roy and Larocque (2020). We perform an extensive simulation study and apply real data analyses to compare the performance of the proposed method to ten existing methods for building prediction intervals with random forests. The results show that the proposed method is very competitive and, globally, outperforms competing methods.
Citations: 1
Bootstrapping Clustered Data in R using lmeresampler
Pub Date : 2021-06-11 DOI: 10.32614/rj-2023-015
A. Loy, J. Korobova
Linear mixed-effects models are commonly used to analyze clustered data structures. There are numerous packages to fit these models in R and conduct likelihood-based inference. The implementation of resampling-based procedures for inference is more limited. In this paper, we introduce the lmeresampler package for bootstrapping nested linear mixed-effects models fit via lme4 or nlme. Bootstrap estimation allows for bias correction, adjusted standard errors and confidence intervals for small sample sizes and when distributional assumptions break down. We will also illustrate how bootstrap resampling can be used to diagnose this model class. In addition, lmeresampler makes it easy to construct interval estimates of functions of model parameters.
Citations: 3
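As a rough illustration of resampling clustered data, in the spirit of the lmeresampler entry above, the sketch below resamples whole clusters with replacement and forms a percentile interval. It is plain Python, not the lmeresampler API. The statistic is a simple grand mean rather than a mixed-model parameter, and the function names and data are hypothetical.

```python
import random
from statistics import mean

def cluster_bootstrap_means(clusters, n_boot=1000, seed=1):
    """Resample whole clusters with replacement; return the bootstrap means."""
    rng = random.Random(seed)
    labels = list(clusters)
    replicates = []
    for _ in range(n_boot):
        drawn = rng.choices(labels, k=len(labels))
        # Keep each drawn cluster intact so within-cluster dependence survives.
        pooled = [value for label in drawn for value in clusters[label]]
        replicates.append(mean(pooled))
    return replicates

def percentile_interval(replicates, level=0.95):
    """Percentile interval taken from the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int(round((1.0 - tail) * last))]

# Hypothetical clustered measurements (e.g., repeated readings per site).
clusters = {
    "a": [4.1, 3.9, 4.4],
    "b": [5.2, 5.0],
    "c": [3.1, 3.3, 2.9, 3.0],
    "d": [4.8, 4.6, 5.1],
}
replicates = cluster_bootstrap_means(clusters)
lower, upper = percentile_interval(replicates)
```

Resampling at the cluster level, rather than at the level of individual observations, is the design choice that respects the grouping structure. With only four clusters the interval is illustrative at best.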
FMM: An R Package for Modeling Rhythmic Patterns in Oscillatory Systems
Pub Date : 2021-05-21 DOI: 10.32614/RJ-2022-015
Itziar Fernández, Alejandro Rodríguez-Collado, Yolanda Larriba, Adrián Lamela, Christian Canedo, Cristina Rueda
This paper is dedicated to the R package FMM which implements a novel approach to describe rhythmic patterns in oscillatory signals. The frequency modulated Möbius (FMM) model is defined as a parametric signal plus a Gaussian noise, where the signal can be described as a single or a sum of waves. The FMM approach is flexible enough to describe a great variety of rhythmic patterns. The FMM package includes all required functions to fit and explore single and multi-wave FMM models, as well as a restricted version that allows equality constraints between parameters representing a priori knowledge about the shape to be included. Moreover, the FMM package can generate synthetic data and visualize the results of the fitting process. The potential of this methodology is illustrated with examples of such biological oscillations as the circadian rhythm in gene expression, the electrical activity of the heartbeat and neuronal activity.
Citations: 7
Visual Diagnostics for Constrained Optimisation with Application to Guided Tours
Pub Date : 2021-04-08 DOI: 10.32614/RJ-2021-105
H. Sherry Zhang, D. Cook, U. Laa, Nicolas Langrené, Patricia Menéndez
A guided tour helps to visualise high-dimensional data by showing low-dimensional projections along a projection pursuit optimisation path. Projection pursuit is a generalisation of principal component analysis, in the sense that different indexes are used to define the interestingness of the projected data. While much work has been done in developing new indexes in the literature, less has been done on understanding the optimisation. Index functions can be noisy, might have multiple local maxima as well as an optimal maximum, and are constrained to generate orthonormal projection frames, which complicates the optimisation. In addition, projection pursuit is primarily used for exploratory data analysis, and finding the local maxima is also useful. The guided tour is especially useful for exploration, because it conducts geodesic interpolation connecting steps in the optimisation and shows how the projected data changes as a maximum is approached. This work provides new visual diagnostics for examining a choice of optimisation procedure, based on the provision of a new data object which collects information throughout the optimisation. It has helped to diagnose and fix several problems with the projection pursuit guided tour. This work might be useful more broadly for diagnosing optimisers and comparing their performance. The diagnostics are implemented in the R package ferrn.
Citations: 3
fairmodels: A Flexible Tool For Bias Detection, Visualization, And Mitigation
Pub Date : 2021-04-01 DOI: 10.32614/rj-2022-019
Jakub Wiśniewski, P. Biecek
Machine learning decision systems are becoming omnipresent in our lives. From dating apps to rating loan seekers, algorithms affect both our well-being and future. Typically, however, these systems are not infallible. Moreover, complex predictive models are prone to learning social biases present in historical data, which can lead to increasing discrimination. If we want to create models responsibly, then we need tools for in-depth validation of models that also take potential discrimination into account. This article introduces an R package fairmodels that helps to validate fairness and eliminate bias in classification models in an easy and flexible fashion. The fairmodels package offers a model-agnostic approach to bias detection, visualization and mitigation. The implemented set of functions and fairness metrics enables model fairness validation from different perspectives. The package includes a series of methods for bias mitigation that aim to diminish the discrimination in the model. The package is designed not only to examine a single model, but also to facilitate comparisons between multiple models.
Citations: 10
krippendorffsalpha: An R Package for Measuring Agreement Using Krippendorff's Alpha Coefficient
Pub Date : 2021-03-22 DOI: 10.32614/rj-2021-046
John Hughes
R package krippendorffsalpha provides tools for measuring agreement using Krippendorff's Alpha coefficient, a well-known nonparametric measure of agreement (also called inter-rater reliability and various other names). This article first develops Krippendorff's Alpha in a natural way, and situates Alpha among statistical procedures. Then the usage of package krippendorffsalpha is illustrated via analyses of two datasets, the latter of which was collected during an imaging study of hip cartilage. The package permits users to apply the Alpha methodology using built-in distance functions for the nominal, ordinal, interval, or ratio levels of measurement. User-defined distance functions are also supported. The fitting function can accommodate any number of units, any number of coders, and missingness. Bootstrap inference is supported, and the bootstrap computation can be carried out in parallel.
Citations: 28
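For readers unfamiliar with the coefficient named in the krippendorffsalpha entry above, the sketch below computes Alpha for nominal data using the classical coincidence-matrix formulation, alpha = 1 - D_o / D_e. It is plain Python and is not the krippendorffsalpha package, whose own formulation and interface may differ. The function name and the example ratings are hypothetical, and missing ratings are marked with `None`.

```python
from collections import Counter
from itertools import permutations

def nominal_alpha(units):
    """Krippendorff's Alpha for nominal ratings.

    `units` is a list of rating lists, one per unit; None marks a missing rating.
    """
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings forms no pairs
        for i, j in permutations(range(m), 2):
            coincidences[(values[i], values[j])] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Observed and expected disagreement, both scaled by n so the factor cancels.
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k]
                   for c in totals for k in totals if c != k) / (n - 1)
    if expected == 0:
        raise ValueError("Alpha is undefined when only one category is used")
    return 1.0 - observed / expected

perfect = nominal_alpha([[1, 1], [2, 2], [1, 1]])
partial = nominal_alpha([[1, 1], [1, 2], [2, 2], [2, None]])
```

In the second call the last unit has a single non-missing rating and is therefore ignored, which shows one simple way missingness can be accommodated.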