Classification under uncertainty: data analysis for diagnostic antibody testing

Paul N Patrone;Anthony J Kearsley
Journal: Mathematical Medicine and Biology: A Journal of the IMA, vol. 38, no. 3, pp. 396-416
Published: 2021-03-01
DOI: 10.1093/imammb/dqab007
Citations: 9

Abstract

Formulating accurate and robust classification strategies is a key challenge in developing diagnostic and antibody tests. Methods that do not explicitly account for disease prevalence and uncertainty therein can lead to significant classification errors. We present a novel method that leverages optimal decision theory to address this problem. As a preliminary step, we develop an analysis that uses an assumed prevalence and conditional probability models of diagnostic measurement outcomes to define optimal (in the sense of minimizing rates of false positives and false negatives) classification domains. Critically, we demonstrate how this strategy can be generalized to a setting in which the prevalence is unknown by either (i) defining a third class of hold-out samples that require further testing or (ii) using an adaptive algorithm to estimate prevalence prior to defining classification domains. We also provide examples for a recently published SARS-CoV-2 serology test and discuss how measurement uncertainty (e.g. associated with instrumentation) can be incorporated into the analysis. We find that our new strategy decreases classification error by up to a decade (i.e. a factor of ten) relative to more traditional methods based on confidence intervals. Moreover, it establishes a theoretical foundation for generalizing techniques such as receiver operating characteristics by connecting them to the broader field of optimization.
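To make the idea of prevalence-dependent classification domains concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional measurement r and hypothetical Gaussian conditional models (`pos_pdf`, `neg_pdf`) whose means and widths are invented for illustration. A sample is labeled positive when the prevalence-weighted positive density exceeds the prevalence-weighted negative density, which is the standard Bayes-optimal rule for minimizing the total rate of false positives and false negatives. The hold-out variant is one simple reading of option (i): a sample is deferred when its label changes across an assumed prevalence range.

```python
import math


def normal_pdf(x, mean, std):
    """Normal density; a stand-in for a conditional measurement model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical conditional models (assumed values, not taken from the paper).
def pos_pdf(r):
    return normal_pdf(r, 3.0, 1.0)  # measurements from positive samples


def neg_pdf(r):
    return normal_pdf(r, 0.0, 1.0)  # measurements from negative samples


def classify(r, prevalence):
    """Label r by comparing prevalence-weighted conditional densities."""
    weighted_pos = prevalence * pos_pdf(r)
    weighted_neg = (1.0 - prevalence) * neg_pdf(r)
    return "positive" if weighted_pos > weighted_neg else "negative"


def classify_with_holdout(r, prev_low, prev_high):
    """Defer samples whose label depends on where the prevalence lies
    within the assumed range [prev_low, prev_high]."""
    label_low = classify(r, prev_low)
    label_high = classify(r, prev_high)
    return label_low if label_low == label_high else "hold-out"


if __name__ == "__main__":
    for r in (1.0, 2.0, 3.0):
        print(r, classify(r, 0.5), classify(r, 0.05),
              classify_with_holdout(r, 0.05, 0.5))
```

With these assumed models, the same measurement r = 2.0 is labeled positive at 50% prevalence but negative at 5% prevalence, which illustrates why a fixed cutoff that ignores prevalence can misclassify.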