Identifying HIV sequences that escape antibody neutralization using random forests and collaborative targeted learning

Journal of Causal Inference. Pub Date: 2022-01-01. DOI: 10.1515/jci-2021-0053. Impact factor: 1.7. Zone 4 (Medicine). JCR Q2 (Mathematics, Interdisciplinary Applications).
Yutong Jin, D. Benkeser
Pages: 280-295. Publication type: Journal Article.
Citations: 0

Abstract

Recent studies have indicated that it is possible to protect individuals from HIV infection using passive infusion of monoclonal antibodies. However, for monoclonal antibodies to confer robust protection, they must be capable of neutralizing many possible strains of the virus. This is particularly challenging for a highly diverse pathogen like HIV. It is therefore of great interest to leverage existing observational data sources to discover antibodies that are able to neutralize HIV via residues where existing antibodies show modest protection. Such information feeds directly into the clinical trial pipeline for monoclonal antibody therapies by indicating (i) whether and to what extent combinations of antibodies can generate superior protection and (ii) strategies for analyzing past clinical trials to identify in vivo evidence of antibody resistance. These observational data include genetic features of many diverse HIV sequences, as well as in vitro measures of antibody resistance. The statistical learning problem of interest is to develop methodology for analyzing these data to identify genetic features that are significantly associated with antibody resistance. This is challenging owing to the high-dimensional and strongly correlated nature of the genetic sequence data. To overcome these challenges, we propose an outcome-adaptive, collaborative targeted minimum loss-based estimation approach using random forests. We demonstrate via simulation that the approach improves on existing approaches in terms of bias, mean squared error, and type I error. We apply the approach to the Compile, Analyze, and Tally NAb Panels (CATNAP) database to identify amino acid (AA) positions that are potentially causally related to resistance to neutralization by several different antibodies.
Source journal
Journal of Causal Inference (Decision Sciences: Statistics, Probability and Uncertainty)
CiteScore: 1.90
Self-citation rate: 14.30%
Articles published: 15
Review time: 86 weeks
About the journal: Journal of Causal Inference (JCI) publishes papers on theoretical and applied causal research across the range of academic disciplines that use quantitative tools to study causality.