HiBGT: High-Performance Bayesian Group Testing for COVID-19

2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC). Pub Date: 2022-12-01. DOI: 10.1109/HiPC56025.2022.00033

Weicong Chen, C. Tatsuoka, Xiaoyi Lu


Citations: 1

Abstract

The COVID-19 pandemic has necessitated disease surveillance using group testing. Novel Bayesian methods using lattice models have been proposed, which offer substantial improvements in group testing efficiency by precisely quantifying uncertainty in diagnoses, accounting for varying individual risk and dilution effects, and guiding optimally convergent sequential pooled test selections. Computationally, however, Bayesian group testing poses considerable challenges because its computational complexity grows exponentially with sample size. HPC and big data stacks are needed to assess computational and statistical performance across fluctuating prevalence levels at large scales. Here, we study how to design and optimize the critical computational components of Bayesian group testing, including lattice model representation, test selection algorithms, and statistical analysis schemes, in the context of parallel computing. To this end, we propose HiBGT, a high-performance Bayesian group testing framework based on Apache Spark, which systematically explores the design space of Bayesian group testing and provides comprehensive heuristics for achieving high-performance, highly scalable Bayesian group testing. We show that HiBGT can perform large-scale test selections (> 250 state iterations) and accelerate statistical analyses by up to 15.9x (up to 363x with minor trade-offs) through a varied selection of sophisticated parallel computing techniques, while achieving near-linear scalability using up to 924 CPU cores.
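To make the exponential growth mentioned in the abstract concrete, the following is a minimal single-machine sketch of Bayesian group testing over a lattice of infection states. It is not HiBGT's implementation: the bitmask state encoding, the function names, the sensitivity and specificity values, the absence of a dilution model, and the halving-style selection rule are all assumptions made for illustration. With N individuals there are 2^N states, so every posterior update and every pool search below touches an exponentially large table.

```python
def prior_over_states(risks):
    """Prior probability of each infection state, indexed by bitmask.

    Bit i of the state is 1 when individual i is infected. Individuals are
    assumed independent with prior risk risks[i] (illustrative assumption).
    """
    n = len(risks)
    probs = []
    for state in range(1 << n):  # 2**n states: the source of exponential cost
        p = 1.0
        for i, r in enumerate(risks):
            p *= r if (state >> i) & 1 else (1.0 - r)
        probs.append(p)
    return probs


def update(posterior, pool, positive, sens=0.95, spec=0.99):
    """Bayes update after observing one pooled test result.

    pool is a bitmask of the individuals in the pool. sens and spec are
    hypothetical values; dilution effects are ignored in this sketch.
    """
    unnormalized = []
    for state, p in enumerate(posterior):
        has_infected = (state & pool) != 0
        p_pos = sens if has_infected else 1.0 - spec
        likelihood = p_pos if positive else 1.0 - p_pos
        unnormalized.append(p * likelihood)
    total = sum(unnormalized)
    return [x / total for x in unnormalized]


def select_pool(posterior, n):
    """Halving-style rule (assumed here): choose the pool whose posterior
    probability of containing no infected individual is closest to 0.5."""
    best_pool, best_gap = None, float("inf")
    for pool in range(1, 1 << n):  # every non-empty subset of individuals
        p_clean = sum(p for s, p in enumerate(posterior) if (s & pool) == 0)
        gap = abs(p_clean - 0.5)
        if gap < best_gap:
            best_pool, best_gap = pool, gap
    return best_pool


def marginals(posterior, n):
    """Posterior probability that each individual is infected."""
    return [
        sum(p for s, p in enumerate(posterior) if (s >> i) & 1)
        for i in range(n)
    ]
```

A typical loop would call `select_pool`, run the chosen pooled test, apply `update`, and stop once every entry of `marginals` is close enough to 0 or 1. The exhaustive pool search costs on the order of 4^N operations per round, which illustrates why the abstract calls for parallel designs at larger sample sizes.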



