HiBGT: High-Performance Bayesian Group Testing for COVID-19

Weicong Chen, C. Tatsuoka, Xiaoyi Lu
Published in: 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC), 2022-12-01
DOI: 10.1109/HiPC56025.2022.00033 (https://doi.org/10.1109/HiPC56025.2022.00033)
Citations: 1

Abstract

The COVID-19 pandemic has necessitated disease surveillance using group testing. Novel Bayesian methods using lattice models have been proposed; they offer substantial improvements in group testing efficiency by precisely quantifying uncertainty in diagnoses, accounting for varying individual risk and dilution effects, and guiding optimally convergent sequential pooled test selections. Computationally, however, Bayesian group testing poses considerable challenges because its computational complexity grows exponentially with sample size. HPC and big data stacks are needed to assess computational and statistical performance across fluctuating prevalence levels at large scales. Here, we study how to design and optimize the critical computational components of Bayesian group testing, including lattice model representation, test selection algorithms, and statistical analysis schemes, in the context of parallel computing. To this end, we propose HiBGT, a high-performance Bayesian group testing framework based on Apache Spark, which systematically explores the design space of Bayesian group testing and provides comprehensive heuristics for achieving high-performance, highly scalable Bayesian group testing. We show that HiBGT can perform large-scale test selections (> 250 state iterations) and accelerate statistical analyses by up to 15.9x (up to 363x with minor trade-offs) through a varied selection of sophisticated parallel computing techniques, while achieving near-linear scalability on up to 924 CPU cores.
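
To make the exponential cost concrete, the following is a minimal sketch of a Bayesian update over a lattice of infection states. It is not the HiBGT implementation: the abstract does not give the model details, so the sketch assumes independent individual risks, a fixed test sensitivity and specificity, and no dilution effect. All function and parameter names (`build_prior`, `update`, `marginals`, `sensitivity`, `specificity`) are hypothetical. Each state is a bitmask in which bit i is set when individual i is infected, so N individuals yield 2^N states.

```python
def build_prior(risks):
    """Prior over all 2^N infection states (assumption: individuals are independent)."""
    n = len(risks)
    prior = []
    for state in range(1 << n):
        p = 1.0
        for i, r in enumerate(risks):
            # Bit i set means individual i is infected in this state.
            p *= r if (state >> i) & 1 else (1.0 - r)
        prior.append(p)
    return prior


def update(posterior, pool, positive, sensitivity=0.95, specificity=0.99):
    """Bayes update after one pooled test.

    `pool` is a bitmask of the individuals in the pool. Assumption: the test
    error rates do not depend on pool size (no dilution modeling).
    """
    weighted = []
    for state, p in enumerate(posterior):
        pool_has_infected = (state & pool) != 0
        p_positive = sensitivity if pool_has_infected else 1.0 - specificity
        likelihood = p_positive if positive else 1.0 - p_positive
        weighted.append(p * likelihood)
    total = sum(weighted)
    return [w / total for w in weighted]


def marginals(posterior, n):
    """Posterior infection probability for each of the n individuals."""
    return [
        sum(p for state, p in enumerate(posterior) if (state >> i) & 1)
        for i in range(n)
    ]


if __name__ == "__main__":
    risks = [0.10, 0.20, 0.05]           # hypothetical individual prior risks
    post = build_prior(risks)
    # Pool individuals 0 and 1 (bitmask 0b011) and observe a negative result.
    post = update(post, pool=0b011, positive=False)
    print(marginals(post, len(risks)))
```

Every update touches all 2^N states, so 30 individuals already require more than a billion entries per pass. This is the growth that motivates distributing the lattice and the test selection across many cores.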