
Pacific Symposium on Biocomputing: Latest Articles

Leveraging 3D Echocardiograms to Evaluate AI Model Performance in Predicting Cardiac Function on Out-of-Distribution Data.
Grant Duffy, Kai Christensen, David Ouyang

Advancements in medical imaging and artificial intelligence (AI) have revolutionized the field of cardiac diagnostics, providing accurate and efficient tools for assessing cardiac function. AI diagnostics claim to improve upon human-to-human variation, which is known to be significant. In practice, however, AI models for cardiac ultrasound are run on images acquired by human sonographers, whose quality and consistency may vary. Because cardiac ultrasound varies more than other medical imaging modalities, variation in image acquisition may lead to out-of-distribution (OOD) data and unpredictable performance of the AI tools. Recent advances in ultrasound technology have allowed the acquisition of 3D as well as 2D data; however, 3D has more limited temporal and spatial resolution and is still not routinely acquired. Because the training datasets used when developing AI algorithms consist mostly of 2D images, it is difficult to determine the impact of human variation on the performance of AI tools in the real world. The objective of this project is to leverage 3D echos to simulate realistic human variation in image acquisition and better understand the OOD performance of a previously validated AI model. In doing so, we developed tools for interpreting 3D echo data and quantifiably recreating common variation in image acquisition between sonographers. We also developed a technique for finding good standard 2D views in 3D echo volumes. We found the performance of the AI model we evaluated to be as expected when the view is good, but variations in acquisition position degraded AI model performance. Performance on far-from-ideal views was poor but still better than random, suggesting that the model uses some information that permeates the whole volume, not just a quality view. Additionally, we found that variations in foreshortening did not result in the same errors that a human would make.
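The idea of recreating acquisition variation from a 3D volume can be illustrated with a small sketch. The abstract does not describe the authors' tooling, so everything here is an assumption made for illustration: the function names `tilted_basis` and `extract_plane`, the nearest-neighbour sampling, the single tilt angle, and the toy volume are all hypothetical.

```python
import math

def tilted_basis(tilt_deg):
    """Return two in-plane direction vectors (x, y, z).

    u stays along the x axis; v is the y axis rotated about x by tilt_deg.
    A tilt of 0 gives an axis-aligned plane; other angles mimic an off-axis probe.
    """
    t = math.radians(tilt_deg)
    u = (1.0, 0.0, 0.0)
    v = (0.0, math.cos(t), math.sin(t))
    return u, v

def extract_plane(volume, origin, u, v, size):
    """Sample a size x size 2D slice from a 3D volume indexed as volume[z][y][x].

    origin is the (x, y, z) centre of the plane. plane[i][j] steps i along u
    and j along v. Sampling is nearest-neighbour; points outside the volume are 0.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    half = (size - 1) / 2.0
    plane = []
    for i in range(size):
        row = []
        for j in range(size):
            a, b = i - half, j - half
            x = round(origin[0] + a * u[0] + b * v[0])
            y = round(origin[1] + a * u[1] + b * v[1])
            z = round(origin[2] + a * u[2] + b * v[2])
            inside = 0 <= x < nx and 0 <= y < ny and 0 <= z < nz
            row.append(volume[z][y][x] if inside else 0)
        plane.append(row)
    return plane

# Toy 3 x 3 x 3 volume whose voxel value equals its z index (hypothetical data).
volume = [[[z for _x in range(3)] for _y in range(3)] for z in range(3)]
centre = (1, 1, 1)

standard_view = extract_plane(volume, centre, *tilted_basis(0), size=3)
off_axis_view = extract_plane(volume, centre, *tilted_basis(90), size=3)
```

In this toy setup the untilted plane stays inside a single z layer, while the tilted plane cuts across layers. A model could then be scored on each extracted view to see how its output changes with the tilt.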

Published: 2024-01-01. Type: Journal Article. Citations: 0
Session Introduction: Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface.
Sajjad Fouladvand, Emma Pierson, Ivana Jankovic, David Ouyang, Jonathan H Chen, Roxana Daneshjou

Artificial Intelligence (AI) models are substantially enhancing the capability to analyze complex and multi-dimensional datasets. Generative AI and deep learning models have demonstrated significant advancements in extracting knowledge from unstructured text, imaging as well as structured and tabular data. This recent breakthrough in AI has inspired research in medicine, leading to the development of numerous tools for creating clinical decision support systems, monitoring tools, image interpretation, and triaging capabilities. Nevertheless, comprehensive research is imperative to evaluate the potential impact and implications of AI systems in healthcare. At the 2024 Pacific Symposium on Biocomputing (PSB) session entitled "Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface", we spotlight research that develops and applies AI algorithms to solve real-world problems in healthcare.

Published: 2024-01-01. Type: Journal Article. Citations: 0
A deep neural network estimation of brain age is sensitive to cognitive impairment and decline.
Yisu Yang, Aditi Sathe, Kurt Schilling, Niranjana Shashikumar, Elizabeth Moore, Logan Dumitrescu, Kimberly R Pechman, Bennett A Landman, Katherine A Gifford, Timothy J Hohman, Angela L Jefferson, Derek B Archer

The greatest known risk factor for Alzheimer's disease (AD) is age. While both normal aging and AD pathology involve structural changes in the brain, their trajectories of atrophy are not the same. Recent developments in artificial intelligence have encouraged studies to leverage neuroimaging-derived measures and deep learning approaches to predict brain age, which has shown promise as a sensitive biomarker in diagnosing and monitoring AD. However, prior efforts primarily involved structural magnetic resonance imaging and conventional diffusion MRI (dMRI) metrics without accounting for partial volume effects. To address this issue, we post-processed our dMRI scans with an advanced free-water (FW) correction technique to compute distinct FW-corrected fractional anisotropy (FA_FWcorr) and FW maps that allow for the separation of tissue from fluid in a scan. We built 3 densely connected neural networks from FW-corrected dMRI, T1-weighted MRI, and combined FW+T1 features, respectively, to predict brain age. We then investigated the relationship of actual age and predicted brain ages with cognition. We found that all models accurately predicted actual age in cognitively unimpaired (CU) controls (FW: r=0.66, p=1.62×10⁻³²; T1: r=0.61, p=1.45×10⁻²⁶; FW+T1: r=0.77, p=6.48×10⁻⁵⁰) and distinguished between CU and mild cognitive impairment participants (FW: p=0.006; T1: p=0.048; FW+T1: p=0.003), with FW+T1-derived age showing best performance. Additionally, all predicted brain age models were significantly associated with cross-sectional cognition (memory, FW: β=-1.094, p=6.32×10⁻⁷; T1: β=-1.331, p=6.52×10⁻⁷; FW+T1: β=-1.476, p=2.53×10⁻¹⁰; executive function, FW: β=-1.276, p=1.46×10⁻⁹; T1: β=-1.337, p=2.52×10⁻⁷; FW+T1: β=-1.850, p=3.85×10⁻¹⁷) and longitudinal cognition (memory, FW: β=-0.091, p=4.62×10⁻¹¹; T1: β=-0.097, p=1.40×10⁻⁸; FW+T1: β=-0.101, p=1.35×10⁻¹¹; executive function, FW: β=-0.125, p=1.20×10⁻¹⁰; T1: β=-0.163, p=4.25×10⁻¹²; FW+T1: β=-0.158, p=1.65×10⁻¹⁴). Our findings provide evidence that both T1-weighted MRI and dMRI measures improve brain age prediction and support predicted brain age as a sensitive biomarker of cognition and cognitive decline.
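The r values above summarize agreement between predicted brain age and actual age. As a minimal sketch of how such a coefficient is computed, assuming r denotes a Pearson correlation (the abstract does not name the statistic) and using made-up ages rather than study data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical ages in years, for illustration only (not data from the study).
actual_age = [60, 65, 70, 75, 80]
predicted_brain_age = [63, 64, 72, 74, 83]

r = pearson_r(actual_age, predicted_brain_age)
```

A value of r close to 1 means the predicted ages rise almost linearly with actual age; it says nothing by itself about a constant offset between the two.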

Published: 2024-01-01. Type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10764074/pdf/ Citations: 0
SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients.
Jason H Moore, Xi Li, Jui-Hsuan Chang, Nicholas P Tatonetti, Dan Theodorescu, Yong Chen, Folkert W Asselbergs, Mythreye Venkatesan, Zhiping Paul Wang

The concept of a digital twin came from the engineering, industrial, and manufacturing domains to create virtual objects or machines that could inform the design and development of real objects. This idea is appealing for precision medicine where digital twins of patients could help inform healthcare decisions. We have developed a methodology for generating and using digital twins for clinical outcome prediction. We introduce a new approach that combines synthetic data and network science to create digital twins (i.e. SynTwin) for precision medicine. First, our approach starts by estimating the distance between all subjects based on their available features. Second, the distances are used to construct a network with subjects as nodes and edges connecting subjects whose distance is less than the percolation threshold. Third, communities or cliques of subjects are defined. Fourth, a large population of synthetic patients is generated using a synthetic data generation algorithm that models the correlation structure of the data to generate new patients. Fifth, digital twins are selected from the synthetic patient population that are within a given distance defining a subject community in the network. Finally, we compare and contrast community-based prediction of clinical endpoints using real subjects, digital twins, or both within and outside of the community. Key to this approach are the digital twins defined using patient similarity that represent hypothetical unobserved patients with patterns similar to nearby real patients as defined by network distance and community structure. We apply our SynTwin approach to predicting mortality in a population-based cancer registry (n=87,674) from the Surveillance, Epidemiology, and End Results (SEER) program from the National Cancer Institute (USA). Our results demonstrate that nearest network neighbor prediction of mortality in this study is significantly improved with digital twins (AUROC=0.864, 95% CI=0.857-0.872) over using real data alone (AUROC=0.791, 95% CI=0.781-0.800). These results suggest a network-based digital twin strategy using synthetic patients may add value to precision medicine efforts.
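The first, second, third, and fifth steps described above can be sketched on toy data. This is an illustration under stated assumptions, not the SynTwin implementation: Euclidean distance, a hand-picked threshold in place of a computed percolation threshold, connected components as a stand-in for community detection, and all function names and values are hypothetical. Synthetic patient generation (the fourth step) is represented only by a hard-coded list.

```python
import math

def euclidean(a, b):
    """Distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_network(features, threshold):
    """Link every pair of subjects whose distance is below the threshold."""
    n = len(features)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if euclidean(features[i], features[j]) < threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def communities(adjacency):
    """Connected components, a simple stand-in for community detection."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups

def select_twins(community_features, synthetic, radius):
    """Keep synthetic patients within radius of any real subject in the community."""
    return [s for s in synthetic
            if any(euclidean(s, f) <= radius for f in community_features)]

# Hypothetical real subjects (two features each) and synthetic patients.
real = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
synthetic = [(0.0, 0.5), (5.0, 5.5), (10.0, 10.0)]

network = build_network(real, threshold=2.0)
groups = communities(network)
first_community = [real[i] for i in sorted(groups[0])]
twins = select_twins(first_community, synthetic, radius=1.0)
```

Here the two nearby pairs of subjects form two communities, and only the synthetic patient lying between the first pair is kept as a twin for that community. Outcome prediction from nearest network neighbors would then draw on those twins in addition to, or instead of, the real subjects.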

Published: 2024-01-01. Type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10827004/pdf/ Citations: 0
Transcript-aware analysis of rare predicted loss-of-function variants in the UK Biobank elucidate new isoform-trait associations.
Rachel A Hoffing, Aimee M Deaton, Aaron M Holleman, Lynne Krohn, Philip J LoGerfo, Mollie E Plekan, Sebastian Akle Serrano, Paul Nioi, Lucas D Ward

A single gene can produce multiple transcripts with distinct molecular functions. Rare-variant association tests often aggregate all coding variants across individual genes, without accounting for the variants' presence or consequence in resulting transcript isoforms. To evaluate the utility of transcript-aware variant sets, rare predicted loss-of-function (pLOF) variants were aggregated for 17,035 protein-coding genes using 55,558 distinct transcript-specific variant sets. These sets were tested for their association with 728 circulating proteins and 188 quantitative phenotypes across 406,921 individuals in the UK Biobank. The transcript-specific approach resulted in larger estimated effects of pLOF variants decreasing serum cis-protein levels compared to the gene-based approach (p_binom ≤ 2×10⁻¹⁶). Additionally, 251 quantitative trait associations were identified as being significant using the transcript-specific approach but not the gene-based approach, including PCSK5 transcript ENST00000376752 and standing height (transcript-specific statistic, P = 1.3×10⁻¹⁶, effect = 0.7 SD decrease; gene-based statistic, P = 0.02, effect = 0.05 SD decrease) and LDLR transcript ENST00000252444 and apolipoprotein B (transcript-specific statistic, P = 5.7×10⁻²⁰, effect = 1.0 SD increase; gene-based statistic, P = 3.0×10⁻⁴, effect = 0.2 SD increase). This approach demonstrates the importance of considering the effect of pLOFs on specific transcript isoforms when performing rare-variant association studies.
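The difference between gene-based and transcript-specific aggregation can be shown with a small sketch. The abstract does not give the authors' pipeline, so the record layout, the function names, and the gene, transcript, variant, and participant identifiers below are all hypothetical; the sketch only illustrates how the same variants fall into different sets under the two groupings.

```python
from collections import defaultdict

def build_variant_sets(variants):
    """Group pLOF variant IDs per gene and per (gene, transcript) pair.

    Each variant record lists the transcripts in which it is predicted
    loss-of-function (assumed input layout).
    """
    gene_sets = defaultdict(set)
    transcript_sets = defaultdict(set)
    for variant in variants:
        gene_sets[variant["gene"]].add(variant["id"])
        for transcript in variant["plof_transcripts"]:
            transcript_sets[(variant["gene"], transcript)].add(variant["id"])
    return dict(gene_sets), dict(transcript_sets)

def carriers(variant_set, genotypes):
    """Participants who carry at least one variant from the set."""
    return {person for person, carried in genotypes.items()
            if carried & variant_set}

# Hypothetical variants: v1 disrupts both transcripts, v2 and v3 only one each.
variants = [
    {"id": "v1", "gene": "GENE_A", "plof_transcripts": {"tx1", "tx2"}},
    {"id": "v2", "gene": "GENE_A", "plof_transcripts": {"tx1"}},
    {"id": "v3", "gene": "GENE_A", "plof_transcripts": {"tx2"}},
]
# Hypothetical participants and the variants each one carries.
genotypes = {"p1": {"v2"}, "p2": {"v3"}, "p3": set()}

gene_sets, transcript_sets = build_variant_sets(variants)
gene_carriers = carriers(gene_sets["GENE_A"], genotypes)
tx1_carriers = carriers(transcript_sets[("GENE_A", "tx1")], genotypes)
```

In this toy case the gene-level set pools p1 and p2 as carriers, while the tx1-specific set contains only p1. If only tx1 matters for a trait, the gene-level grouping dilutes the signal with carriers of variants that leave tx1 intact.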

Published: 2024-01-01. Type: Journal Article. Citations: 0
Generating new drug repurposing hypotheses using disease-specific hypergraphs.
Ayush Jain, Marie-Laure Charpignon, Irene Y Chen, Anthony Philippakis, Ahmed Alaa

The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on network graph representations, comprising a mixture of disease nodes and their interactions, have recently yielded new drug repurposing hypotheses, including suitable candidates for COVID-19. However, these interactomes remain aggregate by design and often lack disease specificity. This dilution of information may affect the relevance of drug node embeddings to a particular disease, the resulting drug-disease and drug-drug similarity scores, and therefore our ability to identify new targets or drug synergies. To address this problem, we propose constructing and learning disease-specific hypergraphs in which hyperedges encode biological pathways of various lengths. We use a modified node2vec algorithm to generate pathway embeddings. We evaluate our hypergraph's ability to find repurposing targets for an incurable but prevalent disease, Alzheimer's disease (AD), and compare our rank-ordered recommendations to those derived from a state-of-the-art knowledge graph, the multiscale interactome. Using our method, we successfully identified 7 promising repurposing candidates for AD that were ranked as unlikely repurposing targets by the multiscale interactome but for which the existing literature provides supporting evidence. Additionally, our drug repositioning suggestions are accompanied by explanations, eliciting plausible biological pathways. In the future, we plan on scaling our proposed method to 800+ diseases, combining single-disease hypergraphs into multi-disease hypergraphs to account for subpopulations with risk factors or encode a given patient's comorbidities to formulate personalized repurposing recommendations. Supplementary materials and code: https://github.com/ayujain04/psb_supplement.
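A hypergraph whose hyperedges are pathways, and a random walk over it of the kind that embedding methods consume, can be sketched as follows. This is a plain uniform walk written for illustration; it is not the authors' modified node2vec, and the pathway names, node names, and function names are hypothetical.

```python
import random

def node_to_edges(hyperedges):
    """Index each node by the hyperedges (pathways) that contain it."""
    index = {}
    for name, nodes in hyperedges.items():
        for node in nodes:
            index.setdefault(node, []).append(name)
    return index

def random_walk(hyperedges, start, length, rng):
    """Uniform hypergraph walk: pick a pathway containing the current node,
    then pick any node on that pathway (the walk may stay in place)."""
    index = node_to_edges(hyperedges)
    walk = [start]
    for _ in range(length - 1):
        pathway = rng.choice(index[walk[-1]])
        walk.append(rng.choice(hyperedges[pathway]))
    return walk

# Hypothetical disease-specific hypergraph: each hyperedge is one pathway
# of arbitrary length linking a drug to downstream entities.
hyperedges = {
    "pathway_1": ["drug_X", "protein_P1", "protein_P2", "disease_D"],
    "pathway_2": ["drug_Y", "protein_P2", "disease_D"],
    "pathway_3": ["drug_X", "protein_P3"],
}

walk = random_walk(hyperedges, "drug_X", length=6, rng=random.Random(0))
```

Many such walks, started from every node, would form the corpus fed to a skip-gram style model to obtain embeddings. Unlike an ordinary edge, a hyperedge lets the walk jump between any two members of the same pathway in one step, which is how pathway membership shapes the resulting neighborhoods.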

{"title":"Generating new drug repurposing hypotheses using disease-specific hypergraphs.","authors":"Ayush Jain, Marie-Laure Charpignon, Irene Y Chen, Anthony Philippakis, Ahmed Alaa","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on network graph representations, comprising a mixture of disease nodes and their interactions, have recently yielded new drug repurposing hypotheses, including suitable candidates for COVID-19. However, these interactomes remain aggregate by design and often lack disease specificity. This dilution of information may affect the relevance of drug node embeddings to a particular disease, the resulting drug-disease and drug-drug similarity scores, and therefore our ability to identify new targets or drug synergies. To address this problem, we propose constructing and learning disease-specific hypergraphs in which hyperedges encode biological pathways of various lengths. We use a modified node2vec algorithm to generate pathway embeddings. We evaluate our hypergraph's ability to find repurposing targets for an incurable but prevalent disease, Alzheimer's disease (AD), and compare our ranked-ordered recommendations to those derived from a state-of-the-art knowledge graph, the multiscale interactome. Using our method, we successfully identified 7 promising repurposing candidates for AD that were ranked as unlikely repurposing targets by the multiscale interactome but for which the existing literature provides supporting evidence. Additionally, our drug repositioning suggestions are accompanied by explanations, eliciting plausible biological pathways. 
In the future, we plan on scaling our proposed method to 800+ diseases, combining single-disease hypergraphs into multi-disease hypergraphs to account for subpopulations with risk factors or encode a given patient's comorbidities to formulate personalized repurposing recommendations.Supplementary materials and code: https://github.com/ayujain04/psb_supplement.</p>","PeriodicalId":34954,"journal":{"name":"Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139075170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LARGE LANGUAGE MODELS (LLMS) AND CHATGPT FOR BIOMEDICINE.
Cecilia Arighi, Steven Brenner, Zhiyong Lu

Large language models (LLMs) are a type of artificial intelligence that is reshaping many fields, including biomedicine. They can process and analyze large amounts of data, understand natural language, and generate new content, which makes them attractive for many biomedical applications and beyond. This workshop aims to give attendees an in-depth understanding of the rise of LLMs in biomedicine and of how they are being used to drive innovation and improve outcomes in the field, along with the associated challenges and pitfalls.

Citations: 0
Scalar-Function Causal Discovery for Generating Causal Hypotheses with Observational Wearable Device Data.
Valeriya Rogovchenko, Austin Sibu, Yang Ni

Digital health technologies such as wearable devices have transformed health data analytics, providing continuous, high-resolution functional data on various health metrics and thereby opening new avenues for research. In this work, we introduce a new approach for generating causal hypotheses for a pair of variables consisting of a continuous functional variable (e.g., physical activity recorded over time) and a binary scalar variable (e.g., a mobility condition indicator). Our method goes beyond traditional association-focused approaches and has the potential to reveal the underlying causal mechanism. We show theoretically that the proposed scalar-function causal model is identifiable from observational data alone. Our identifiability theory justifies the use of a simple yet principled algorithm that discerns the causal relationship by comparing the likelihood functions of competing causal hypotheses. The robustness and applicability of our method are demonstrated through simulation studies and a real-world application using wearable device data from the National Health and Nutrition Examination Survey.

Citations: 0
A Conversational Agent for Early Detection of Neurotoxic Effects of Medications through Automated Intensive Observation.
Serguei Pakhomov, Jacob Solinsky, Martin Michalowski, Veronika Bachanova

We present a fully automated AI-based system for intensive monitoring of the cognitive symptoms of neurotoxicity that frequently appear as a result of immunotherapy for hematologic malignancies. Early manifestations of these symptoms are evident in the patient's speech in the form of mild aphasia and confusion, and they can be detected and effectively treated before the onset of more serious and potentially life-threatening impairment. We have developed the Automated Neural Nursing Assistant (ANNA), a system designed to conduct a brief cognitive assessment over the telephone several times per day for 5-14 days following infusion of the immunotherapy medication. ANNA uses a conversational agent based on a large language model to elicit spontaneous speech in a semi-structured dialogue, followed by a series of brief language-based neurocognitive tests. In this paper we share ANNA's design and implementation, report the results of a pilot functional evaluation study, and discuss the technical and logistical challenges facing the introduction of this type of technology into clinical practice. A large-scale clinical evaluation of ANNA will be conducted in an observational study of patients undergoing immunotherapy at the University of Minnesota Masonic Cancer Center starting in Fall 2023.

Citations: 0
Creation of a Curated Database of Experimentally Determined Human Protein Structures for the Identification of Its Targetome.
Armand Ovanessians, Carson Snow, Thomas Jennewein, Susanta Sarkar, Gil Speyer, Judith Klein-Seetharaman

Assembling an "integrated structural map of the human cell" at atomic resolution will require a complete set of all human protein structures available for interaction with other biomolecules (the human protein structure targetome) and a pipeline of automated tools that allow quantitative analysis of millions of protein-ligand interactions. Toward this goal, we describe the creation of a curated database of experimentally determined human protein structures. Starting with the sequences of 20,422 human proteins, we selected the most representative structure for each protein (if available) from the Protein Data Bank (PDB), ranking structures by sequence coverage, depth (the difference between the final and initial residue number of each chain), resolution, and the experimental method used to determine the structure. To enable expansion into an entire human targetome, we docked small-molecule ligands to our curated set of protein structures. Using design constraints derived from comparing structure assembly and ligand docking results for challenging protein examples, we propose to combine this curated database of experimental structures in the future with AlphaFold predictions and multi-domain assembly using DEMO2. To demonstrate the utility of our curated database in identifying the human protein structure targetome, we docked ligands with AutoDock Vina and created tools for automated analysis of the affinities and binding-site locations in the thousands of protein-ligand prediction results. The resulting human targetome, which can be updated and expanded with an evolving curated database and increasing numbers of ligands, is a valuable addition to the growing toolkit of structural bioinformatics.
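The structure-selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract names the four ranking criteria but not how they are combined, so the lexicographic ordering, the contiguous-residue coverage estimate, the `METHOD_RANK` preference table, and all class and function names below are assumptions made for illustration. The PDB identifiers are made up.

```python
from dataclasses import dataclass

# Hypothetical preference order for experimental methods (lower is preferred).
# The abstract does not state which methods are favoured; this is an assumption.
METHOD_RANK = {"X-RAY DIFFRACTION": 0, "ELECTRON MICROSCOPY": 1, "SOLUTION NMR": 2}


@dataclass(frozen=True)
class Chain:
    pdb_id: str
    first_residue: int
    last_residue: int
    resolution: float  # in angstroms; lower is better
    method: str

    @property
    def depth(self) -> int:
        # Depth as defined in the abstract: final minus initial residue number.
        return self.last_residue - self.first_residue


def coverage(chain: Chain, sequence_length: int) -> float:
    # Simplifying assumption: residues are contiguous, so the covered
    # fraction is (depth + 1) / sequence length, capped at 1.0.
    return min(1.0, (chain.depth + 1) / sequence_length)


def rank_key(chain: Chain, sequence_length: int) -> tuple:
    # Assumed lexicographic ranking in the order the criteria are listed:
    # higher coverage, then greater depth, then finer resolution, then method.
    return (
        -coverage(chain, sequence_length),
        -chain.depth,
        chain.resolution,
        METHOD_RANK.get(chain.method, len(METHOD_RANK)),
    )


def most_representative(chains, sequence_length: int):
    # Returns None when no structure is available for the protein.
    if not chains:
        return None
    return min(chains, key=lambda c: rank_key(c, sequence_length))


# Usage with made-up candidates for a 200-residue protein.
example_chains = [
    Chain("0AAA", 1, 100, 2.0, "X-RAY DIFFRACTION"),
    Chain("0BBB", 1, 200, 3.0, "ELECTRON MICROSCOPY"),
    Chain("0CCC", 1, 200, 1.8, "SOLUTION NMR"),
]
best = most_representative(example_chains, 200)
print(best.pdb_id)  # prints "0CCC": full coverage, then the best resolution
```

A tuple key keeps the priority order explicit and easy to change. A weighted score would be an equally plausible reading of the abstract.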

Citations: 0