Benchmarking R Packages for Calculation of Persistent Homology

IF 1.1; Zone 4 (Computer Science); Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; R Journal; Pub Date: 2021-06-01; Epub Date: 2021-06-07; DOI: 10.32614/RJ-2021-033

Eashwar V Somasundaram, Shael E Brown, Adam Litzler, Jacob G Scott, Raoul R Wadhwa

R Journal, volume 13, issue 1, pages 184-193. Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434812/pdf/nihms-1733366.pdf

Citations: 7

Abstract




Several persistent homology software libraries have been implemented in R. Specifically, the Dionysus, GUDHI, and Ripser libraries have been wrapped by the TDA and TDAstats CRAN packages. These libraries are powerful analysis tools, but they are computationally expensive and, to our knowledge, have not been formally benchmarked. Here, we analyze runtime and memory growth for the 2 R packages and the 3 underlying libraries. We find that persistent homology of datasets with fewer than 3 dimensions is computed fastest by the GUDHI library in the TDA package. For higher-dimensional datasets, the Ripser library in the TDAstats package is the fastest. Ripser and TDAstats are also the most memory-efficient tools for calculating persistent homology.
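To make "runtime and memory growth" concrete, the sketch below times a small persistent homology computation on point clouds of increasing size. It is an illustration only, not the paper's benchmark: it is written in Python rather than R, it does not call TDA, TDAstats, Dionysus, GUDHI, or Ripser, and it covers only 0-dimensional features of a Vietoris-Rips filtration, computed with a union-find pass over sorted edges. The function names (`h0_persistence`, `sample_unit_circle`, `benchmark`) and the unit-circle test data are assumptions introduced for this example.

```python
import math
import random
import time
import tracemalloc


def h0_persistence(points):
    """Death times of 0-dimensional Vietoris-Rips features (all born at 0)."""
    n = len(points)
    # All pairwise edges sorted by Euclidean length: the Rips filtration order.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # two components merge, so one feature dies
            deaths.append(length)
    return deaths  # n - 1 finite deaths; the last component never dies


def sample_unit_circle(n, rng):
    """n points drawn uniformly from the unit circle (hypothetical test data)."""
    angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return [(math.cos(a), math.sin(a)) for a in angles]


def benchmark(sizes, seed=0):
    """Return (n, seconds, peak_bytes) for each point-cloud size."""
    rng = random.Random(seed)
    rows = []
    for n in sizes:
        pts = sample_unit_circle(n, rng)
        # Note: tracemalloc slows execution, so timings here are inflated;
        # a careful benchmark would measure time and memory in separate runs.
        tracemalloc.start()
        start = time.perf_counter()
        h0_persistence(pts)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        rows.append((n, elapsed, peak))
    return rows


if __name__ == "__main__":
    for n, seconds, peak in benchmark([50, 100, 200]):
        print(f"n={n:4d}  time={seconds:.4f}s  peak={peak / 1024:.0f} KiB")
```

Because the sketch stores every pairwise edge, both time and memory grow at least quadratically in the number of points. Plotting the three columns against `n` gives the kind of growth curve a benchmark like the one described in the abstract would compare across libraries.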


Source journal: R Journal (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; STATISTICS & PROBABILITY)

CiteScore: 2.70

Self-citation rate: 0.00%

Review time: >12 weeks

About the journal: The R Journal is the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R. The R Journal intends to reach a wide audience and have a thorough review process. Papers are expected to be reasonably short, clearly written, not too technical, and of course focused on R. Authors of refereed articles should take care to: put their contribution in context, in particular discuss related R functions or packages; explain the motivation for their contribution; provide code examples that are reproducible.
