
Pacific Symposium on Biocomputing: Latest Articles

Modeling Path Importance for Effective Alzheimer's Disease Drug Repurposing.
Shunian Xiang, Patrick J Lawrence, Bo Peng, ChienWei Chiang, Dokyoon Kim, Li Shen, Xia Ning

Recently, drug repurposing has emerged as an effective and resource-efficient paradigm for AD drug discovery. Among various methods for drug repurposing, network-based methods have shown promising results as they are capable of leveraging complex networks that integrate multiple interaction types, such as protein-protein interactions, to more effectively identify candidate drugs. However, existing approaches typically assume paths of the same length in the network have equal importance in identifying the therapeutic effect of drugs. Other domains have found that same length paths do not necessarily have the same importance. Thus, relying on this assumption may be deleterious to drug repurposing attempts. In this work, we propose MPI (Modeling Path Importance), a novel network-based method for AD drug repurposing. MPI is unique in that it prioritizes important paths via learned node embeddings, which can effectively capture a network's rich structural information. Thus, leveraging learned embeddings allows MPI to effectively differentiate the importance among paths. We evaluate MPI against a commonly used baseline method that identifies anti-AD drug candidates primarily based on the shortest paths between drugs and AD in the network. We observe that among the top-50 ranked drugs, MPI prioritizes 20.0% more drugs with anti-AD evidence compared to the baseline. Finally, Cox proportional-hazard models produced from insurance claims data aid us in identifying the use of etodolac, nicotine, and BBB-crossing ACE-INHs as having a reduced risk of AD, suggesting such drugs may be viable candidates for repurposing and should be explored further in future studies.
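
To illustrate the distinction the abstract draws, the sketch below scores two equal-length drug-to-disease paths differently using node embeddings. It is not MPI's scoring function, which the abstract does not specify: the toy graph, the embedding values, and the `path_score` rule (mean cosine similarity of consecutive nodes) are all assumptions made for illustration.

```python
import math

# Hypothetical 2-D node embeddings; in practice these would be learned from the network.
EMBEDDINGS = {
    "drug": (1.0, 0.0),
    "protein_a": (0.9, 0.1),
    "protein_b": (0.0, 1.0),
    "AD": (0.8, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def path_score(path):
    """Assumed scoring rule: mean cosine similarity of consecutive nodes on the path."""
    sims = [cosine(EMBEDDINGS[a], EMBEDDINGS[b]) for a, b in zip(path, path[1:])]
    return sum(sims) / len(sims)


path_1 = ["drug", "protein_a", "AD"]
path_2 = ["drug", "protein_b", "AD"]
# Both paths have the same length, so a purely length-based ranking cannot separate them.
score_1 = path_score(path_1)
score_2 = path_score(path_2)
```

Under these assumed embeddings, `path_1` scores higher than `path_2` even though both have two edges, which is the kind of differentiation a shortest-path baseline cannot make.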

Pacific Symposium on Biocomputing, vol. 29, pp. 306-321. Published 2024-01-01. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11056095/pdf/
Citations: 0
Imputation of race and ethnicity categories using genetic ancestry from real-world genomic testing data.
Brooke Rhead, Paige E Haffener, Yannick Pouliot, Francisco M De La Vega

The incompleteness of race and ethnicity information in real-world data (RWD) hampers its utility in promoting healthcare equity. This study introduces two methods, one heuristic and the other machine learning-based, to impute race and ethnicity from genetic ancestry using tumor profiling data. Analyzing de-identified data from over 100,000 cancer patients sequenced with the Tempus xT panel, we demonstrate that both methods outperform existing geolocation and surname-based methods, with the machine learning approach achieving high recall (range: 0.859-0.993) and precision (range: 0.932-0.981) across four mutually exclusive race and ethnicity categories. This work presents a novel pathway to enhance RWD utility in studying racial disparities in healthcare.
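
The recall and precision ranges quoted above are per-category metrics over mutually exclusive labels. As a minimal sketch of how such figures are computed, the example below uses made-up labels; `cat_1` and `cat_2` are placeholder category names, and the data are not from the study.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for mutually exclusive labels."""
    tp, pred_count, true_count = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            tp[t] += 1
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        recall = tp[label] / true_count[label] if true_count[label] else 0.0
        result[label] = (precision, recall)
    return result


# Hypothetical recorded labels and imputed labels for five patients.
y_true = ["cat_1", "cat_1", "cat_1", "cat_2", "cat_2"]
y_pred = ["cat_1", "cat_1", "cat_2", "cat_2", "cat_2"]
metrics = per_class_precision_recall(y_true, y_pred)
```

Reporting the minimum and maximum of each metric across categories gives a range of the form quoted in the abstract.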

Pacific Symposium on Biocomputing, vol. 29, pp. 433-445. Published 2024-01-01. Journal Article.
Citations: 0
Session Introduction: Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface.
Sajjad Fouladvand, Emma Pierson, Ivana Jankovic, David Ouyang, Jonathan H Chen, Roxana Daneshjou

Artificial Intelligence (AI) models are substantially enhancing the capability to analyze complex and multi-dimensional datasets. Generative AI and deep learning models have demonstrated significant advancements in extracting knowledge from unstructured text, imaging as well as structured and tabular data. This recent breakthrough in AI has inspired research in medicine, leading to the development of numerous tools for creating clinical decision support systems, monitoring tools, image interpretation, and triaging capabilities. Nevertheless, comprehensive research is imperative to evaluate the potential impact and implications of AI systems in healthcare. At the 2024 Pacific Symposium on Biocomputing (PSB) session entitled "Artificial Intelligence in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface", we spotlight research that develops and applies AI algorithms to solve real-world problems in healthcare.

Pacific Symposium on Biocomputing, vol. 29, pp. 1-7. Published 2024-01-01. Journal Article.
Citations: 0
A deep neural network estimation of brain age is sensitive to cognitive impairment and decline.
Yisu Yang, Aditi Sathe, Kurt Schilling, Niranjana Shashikumar, Elizabeth Moore, Logan Dumitrescu, Kimberly R Pechman, Bennett A Landman, Katherine A Gifford, Timothy J Hohman, Angela L Jefferson, Derek B Archer

The greatest known risk factor for Alzheimer's disease (AD) is age. While both normal aging and AD pathology involve structural changes in the brain, their trajectories of atrophy are not the same. Recent developments in artificial intelligence have encouraged studies to leverage neuroimaging-derived measures and deep learning approaches to predict brain age, which has shown promise as a sensitive biomarker in diagnosing and monitoring AD. However, prior efforts primarily involved structural magnetic resonance imaging and conventional diffusion MRI (dMRI) metrics without accounting for partial volume effects. To address this issue, we post-processed our dMRI scans with an advanced free-water (FW) correction technique to compute distinct FW-corrected fractional anisotropy (FA_FWcorr) and FW maps that allow for the separation of tissue from fluid in a scan. We built 3 densely connected neural networks from FW-corrected dMRI, T1-weighted MRI, and combined FW+T1 features, respectively, to predict brain age. We then investigated the relationship of actual age and predicted brain ages with cognition. We found that all models accurately predicted actual age in cognitively unimpaired (CU) controls (FW: r=0.66, p=1.62×10⁻³²; T1: r=0.61, p=1.45×10⁻²⁶; FW+T1: r=0.77, p=6.48×10⁻⁵⁰) and distinguished between CU and mild cognitive impairment participants (FW: p=0.006; T1: p=0.048; FW+T1: p=0.003), with FW+T1-derived age showing best performance. Additionally, all predicted brain age models were significantly associated with cross-sectional cognition (memory, FW: β=-1.094, p=6.32×10⁻⁷; T1: β=-1.331, p=6.52×10⁻⁷; FW+T1: β=-1.476, p=2.53×10⁻¹⁰; executive function, FW: β=-1.276, p=1.46×10⁻⁹; T1: β=-1.337, p=2.52×10⁻⁷; FW+T1: β=-1.850, p=3.85×10⁻¹⁷) and longitudinal cognition (memory, FW: β=-0.091, p=4.62×10⁻¹¹; T1: β=-0.097, p=1.40×10⁻⁸; FW+T1: β=-0.101, p=1.35×10⁻¹¹; executive function, FW: β=-0.125, p=1.20×10⁻¹⁰; T1: β=-0.163, p=4.25×10⁻¹²; FW+T1: β=-0.158, p=1.65×10⁻¹⁴). Our findings provide evidence that both T1-weighted MRI and dMRI measures improve brain age prediction and support predicted brain age as a sensitive biomarker of cognition and cognitive decline.
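
The r values above are Pearson correlations between actual and predicted age. The sketch below shows that computation on invented ages; the numbers are not from the study, and the `brain_age_gap` quantity (predicted minus actual age) is a commonly used derived measure added here as an assumption, since the abstract does not name it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical actual ages and model-predicted brain ages (years).
actual_age = [60.0, 65.0, 70.0, 75.0, 80.0]
predicted_age = [62.0, 64.0, 73.0, 74.0, 83.0]

r = pearson_r(actual_age, predicted_age)
# Assumed derived measure: positive values mean the brain "looks older" than the person is.
brain_age_gap = [p - a for p, a in zip(predicted_age, actual_age)]
```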

Pacific Symposium on Biocomputing, vol. 29, pp. 148-162. Published 2024-01-01. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10764074/pdf/
Citations: 0
Transcript-aware analysis of rare predicted loss-of-function variants in the UK Biobank elucidate new isoform-trait associations.
Rachel A Hoffing, Aimee M Deaton, Aaron M Holleman, Lynne Krohn, Philip J LoGerfo, Mollie E Plekan, Sebastian Akle Serrano, Paul Nioi, Lucas D Ward

A single gene can produce multiple transcripts with distinct molecular functions. Rare-variant association tests often aggregate all coding variants across individual genes, without accounting for the variants' presence or consequence in resulting transcript isoforms. To evaluate the utility of transcript-aware variant sets, rare predicted loss-of-function (pLOF) variants were aggregated for 17,035 protein-coding genes using 55,558 distinct transcript-specific variant sets. These sets were tested for their association with 728 circulating proteins and 188 quantitative phenotypes across 406,921 individuals in the UK Biobank. The transcript-specific approach resulted in larger estimated effects of pLOF variants decreasing serum cis-protein levels compared to the gene-based approach (p_binom ≤ 2×10⁻¹⁶). Additionally, 251 quantitative trait associations were identified as being significant using the transcript-specific approach but not the gene-based approach, including PCSK5 transcript ENST00000376752 and standing height (transcript-specific statistic, P = 1.3×10⁻¹⁶, effect = 0.7 SD decrease; gene-based statistic, P = 0.02, effect = 0.05 SD decrease) and LDLR transcript ENST00000252444 and apolipoprotein B (transcript-specific statistic, P = 5.7×10⁻²⁰, effect = 1.0 SD increase; gene-based statistic, P = 3.0×10⁻⁴, effect = 0.2 SD increase). This approach demonstrates the importance of considering the effect of pLOFs on specific transcript isoforms when performing rare-variant association studies.
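
The difference between gene-based and transcript-specific aggregation can be sketched as follows. This is an illustration only, not the authors' pipeline: the gene, transcript, variant, and person identifiers are hypothetical, and the grouping rule (a variant joins the set of every transcript it disrupts) is an assumption.

```python
from collections import defaultdict

# Hypothetical pLOF variants: (variant_id, gene, transcripts the variant disrupts).
VARIANTS = [
    ("v1", "GENE1", {"T1", "T2"}),
    ("v2", "GENE1", {"T1"}),
    ("v3", "GENE1", {"T2"}),
]
# Hypothetical genotypes: person -> set of variants carried.
CARRIERS = {"p1": {"v1"}, "p2": {"v2"}, "p3": {"v3"}, "p4": set()}


def build_variant_sets(variants):
    """Group variants per gene and per affected transcript."""
    gene_sets, transcript_sets = defaultdict(set), defaultdict(set)
    for variant_id, gene, transcripts in variants:
        gene_sets[gene].add(variant_id)
        for transcript in transcripts:
            transcript_sets[transcript].add(variant_id)
    return gene_sets, transcript_sets


def carriers_of(variant_set, carriers):
    """People carrying at least one variant from the set."""
    return sorted(p for p, carried in carriers.items() if carried & variant_set)


gene_sets, transcript_sets = build_variant_sets(VARIANTS)
gene_carriers = carriers_of(gene_sets["GENE1"], CARRIERS)
t1_carriers = carriers_of(transcript_sets["T1"], CARRIERS)
```

Here the gene-level set pools all three carriers, while the `T1` set excludes the carrier of `v3`, whose variant does not affect that transcript. An association test run on each carrier group can therefore give different effect estimates.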

Pacific Symposium on Biocomputing, vol. 29, pp. 247-260. Published 2024-01-01. Journal Article.
Citations: 0
SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients.
Jason H Moore, Xi Li, Jui-Hsuan Chang, Nicholas P Tatonetti, Dan Theodorescu, Yong Chen, Folkert W Asselbergs, Mythreye Venkatesan, Zhiping Paul Wang

The concept of a digital twin came from the engineering, industrial, and manufacturing domains to create virtual objects or machines that could inform the design and development of real objects. This idea is appealing for precision medicine where digital twins of patients could help inform healthcare decisions. We have developed a methodology for generating and using digital twins for clinical outcome prediction. We introduce a new approach that combines synthetic data and network science to create digital twins (i.e. SynTwin) for precision medicine. First, our approach starts by estimating the distance between all subjects based on their available features. Second, the distances are used to construct a network with subjects as nodes and edges defining distance less than the percolation threshold. Third, communities or cliques of subjects are defined. Fourth, a large population of synthetic patients is generated using a synthetic data generation algorithm that models the correlation structure of the data to generate new patients. Fifth, digital twins are selected from the synthetic patient population that are within a given distance defining a subject community in the network. Finally, we compare and contrast community-based prediction of clinical endpoints using real subjects, digital twins, or both within and outside of the community. Key to this approach are the digital twins defined using patient similarity that represent hypothetical unobserved patients with patterns similar to nearby real patients as defined by network distance and community structure. We apply our SynTwin approach to predicting mortality in a population-based cancer registry (n=87,674) from the Surveillance, Epidemiology, and End Results (SEER) program from the National Cancer Institute (USA). Our results demonstrate that nearest network neighbor prediction of mortality in this study is significantly improved with digital twins (AUROC=0.864, 95% CI=0.857-0.872) over just using real data alone (AUROC=0.791, 95% CI=0.781-0.800). These results suggest a network-based digital twin strategy using synthetic patients may add value to precision medicine efforts.
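
The steps listed in the abstract can be sketched on toy data as below. Several simplifications are assumptions rather than the authors' method: Euclidean distance stands in for the subject distance, a fixed `THRESHOLD` stands in for the percolation threshold, connected components stand in for community detection, and the synthetic patients are written by hand instead of being produced by a synthetic data generator.

```python
import math

# Hypothetical real subjects and synthetic patients: id -> (features, outcome).
REAL = {"r1": ((0.0, 0.0), 0), "r2": ((0.0, 1.0), 0),
        "r3": ((5.0, 5.0), 1), "r4": ((5.0, 6.0), 1)}
SYNTHETIC = {"s1": ((0.5, 0.5), 0), "s2": ((5.5, 5.5), 1), "s3": ((20.0, 20.0), 1)}
THRESHOLD = 2.0  # assumed cut-off standing in for the percolation threshold


def communities(subjects, threshold):
    """Connected components of the graph linking subjects closer than threshold."""
    unvisited, groups = set(subjects), []
    while unvisited:
        stack, group = [unvisited.pop()], set()
        while stack:
            node = stack.pop()
            group.add(node)
            near = {o for o in unvisited
                    if math.dist(subjects[node][0], subjects[o][0]) < threshold}
            unvisited -= near
            stack.extend(near)
        groups.append(group)
    return groups


def twins_for(group, subjects, synthetic, threshold):
    """Synthetic patients within threshold of any real member of the community."""
    return {s for s, (features, _) in synthetic.items()
            if any(math.dist(features, subjects[m][0]) < threshold for m in group)}


def predict(features, pool):
    """Outcome of the nearest neighbor in pool (id -> (features, outcome))."""
    nearest = min(pool, key=lambda k: math.dist(features, pool[k][0]))
    return pool[nearest][1]


groups = communities(REAL, THRESHOLD)
twin_ids = set().union(*(twins_for(g, REAL, SYNTHETIC, THRESHOLD) for g in groups))
twin_pool = {s: SYNTHETIC[s] for s in twin_ids}
prediction = predict((4.8, 5.2), twin_pool)  # nearest-twin prediction for a new subject
```

In this toy setup the distant synthetic patient `s3` is not selected as a twin because it lies outside every community's neighborhood, which mirrors the selection step described in the abstract.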

Pacific Symposium on Biocomputing, vol. 29, pp. 96-107. Published 2024-01-01. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10827004/pdf/
Citations: 0
Leveraging 3D Echocardiograms to Evaluate AI Model Performance in Predicting Cardiac Function on Out-of-Distribution Data.
Grant Duffy, Kai Christensen, David Ouyang

Advancements in medical imaging and artificial intelligence (AI) have revolutionized the field of cardiac diagnostics, providing accurate and efficient tools for assessing cardiac function. AI diagnostics claims to improve upon the human-to-human variation that is known to be significant. However, when put in practice, for cardiac ultrasound, AI models are being run on images acquired by human sonographers whose quality and consistency may vary. With more variation than other medical imaging modalities, variation in image acquisition may lead to out-of-distribution (OOD) data and unpredictable performance of the AI tools. Recent advances in ultrasound technology have allowed the acquisition of both 3D and 2D data; however, 3D has more limited temporal and spatial resolution and is still not routinely acquired. Because the training datasets used when developing AI algorithms are mostly developed using 2D images, it is difficult to determine the impact of human variation on the performance of AI tools in the real world. The objective of this project is to leverage 3D echos to simulate realistic human variation of image acquisition and better understand the OOD performance of a previously validated AI model. In doing so, we develop tools for interpreting 3D echo data and quantifiably recreating common variation in image acquisition between sonographers. We also developed a technique for finding good standard 2D views in 3D echo volumes. We found the performance of the AI model we evaluated to be as expected when the view is good, but variations in acquisition position degraded AI model performance. Performance on far from ideal views was poor, but still better than random, suggesting that there is some information being used that permeates the whole volume, not just a quality view. Additionally, we found that variations in foreshortening didn't result in the same errors that a human would make.

Pacific Symposium on Biocomputing, vol. 29, pp. 39-52, 2024. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11684417/pdf/
Lymphocyte Count Derived Polygenic Score and Interindividual Variability in CD4 T-cell Recovery in Response to Antiretroviral Therapy.
Kathleen M Cardone, Scott Dudek, Karl Keat, Yuki Bradford, Zinhle Cindi, Eric S Daar, Roy Gulick, Sharon A Riddler, Jeffrey L Lennox, Phumla Sinxadi, David W Haas, Marylyn D Ritchie

Access to safe and effective antiretroviral therapy (ART) is a cornerstone in the global response to the HIV pandemic. Among people living with HIV, there is considerable interindividual variability in absolute CD4 T-cell recovery following initiation of virally suppressive ART. The contribution of host genetics to this variability is not well understood. Because summary statistics for CD4 T-cell count are not publicly available, we explored the contribution of a polygenic score (PGSlymph) derived from large, publicly available summary statistics for absolute lymphocyte count in the general population. We explored associations with baseline CD4 T-cell count prior to ART initiation (n=4959) and with change from baseline to week 48 on ART (n=3274) among treatment-naïve participants in prospective, randomized ART studies of the AIDS Clinical Trials Group. We separately examined an African-ancestry-derived and a European-ancestry-derived PGSlymph, and evaluated their performance across all participants and in the African and European ancestral groups separately. Multivariate models that included PGSlymph, baseline plasma HIV-1 RNA, age, sex, and 15 principal components (PCs) of genetic similarity explained ∼26-27% of variability in baseline CD4 T-cell count, but PGSlymph accounted for <1% of this variability. Models that also included baseline CD4 T-cell count explained ∼7-9% of variability in CD4 T-cell count increase on ART, but PGSlymph accounted for <1% of this variability. In univariate analyses, PGSlymph was not significantly associated with baseline or change in CD4 T-cell count. Among individuals of African ancestry, the African PGSlymph term was significantly associated with change in CD4 T-cell count in the multivariate model but not in the univariate model. When applied to lymphocyte count in a general medical biobank population (Penn Medicine BioBank), PGSlymph explained ∼6-10% of variability in multivariate models (including age, sex, and PCs) but only ∼1% in univariate models. In summary, a lymphocyte count PGS derived from the general population was not consistently associated with CD4 T-cell recovery on ART. Nonetheless, adjusting for clinical covariates is quite important when estimating such polygenic effects.
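
The abstract separates the variability explained by a full multivariate model from the share attributable to the polygenic score. One common way to express that share is the incremental R², the gain in R² when the score is added to a covariates-only model. The sketch below illustrates the idea on synthetic data; it is not the authors' analysis, and the function names (`r_squared`, `incremental_r2`), the three stand-in covariates, and the effect sizes are all assumptions made for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def incremental_r2(covariates, pgs, y):
    """Return (R^2 without PGS, R^2 with PGS, difference)."""
    base = r_squared(covariates, y)
    full = r_squared(np.column_stack([covariates, pgs]), y)
    return base, full, full - base

# Synthetic data (hypothetical): covariates carry most of the signal,
# while the polygenic score has only a very small effect on the outcome.
rng = np.random.default_rng(0)
n = 2000
covariates = rng.normal(size=(n, 3))   # stand-ins for age, sex, viral load
pgs = rng.normal(size=n)               # stand-in for a standardized PGS
y = (covariates @ np.array([1.0, 0.5, -0.5])
     + 0.05 * pgs
     + rng.normal(scale=1.5, size=n))

base, full, delta = incremental_r2(covariates, pgs, y)
print(f"R2 covariates only: {base:.3f}")
print(f"R2 with PGS added:  {full:.3f}")
print(f"incremental R2:     {delta:.4f}")
```

Because the two models are nested, the full-model R² can never be lower than the covariates-only R², so the difference is a non-negative measure of what the score adds beyond the covariates.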

Pacific Symposium on Biocomputing, vol. 29, pp. 594-610, 2024. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10764076/pdf/
Generating new drug repurposing hypotheses using disease-specific hypergraphs.
Ayush Jain, Marie-Laure Charpignon, Irene Y Chen, Anthony Philippakis, Ahmed Alaa

The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on network graph representations, comprising a mixture of disease nodes and their interactions, have recently yielded new drug repurposing hypotheses, including suitable candidates for COVID-19. However, these interactomes remain aggregate by design and often lack disease specificity. This dilution of information may affect the relevance of drug node embeddings to a particular disease, the resulting drug-disease and drug-drug similarity scores, and therefore our ability to identify new targets or drug synergies. To address this problem, we propose constructing and learning disease-specific hypergraphs in which hyperedges encode biological pathways of various lengths. We use a modified node2vec algorithm to generate pathway embeddings. We evaluate our hypergraph's ability to find repurposing targets for an incurable but prevalent disease, Alzheimer's disease (AD), and compare our rank-ordered recommendations to those derived from a state-of-the-art knowledge graph, the multiscale interactome. Using our method, we identified 7 promising repurposing candidates for AD that the multiscale interactome ranked as unlikely repurposing targets but for which the existing literature provides supporting evidence. Additionally, our drug repositioning suggestions are accompanied by explanations that elicit plausible biological pathways. In the future, we plan to scale our proposed method to 800+ diseases, combining single-disease hypergraphs into multi-disease hypergraphs to account for subpopulations with risk factors or to encode a given patient's comorbidities and formulate personalized repurposing recommendations. Supplementary materials and code: https://github.com/ayujain04/psb_supplement.
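
To make the hypergraph idea concrete, the sketch below stores each pathway as a hyperedge (a set of nodes of any size) and generates uniform random walks that hop between nodes sharing a hyperedge. This is only an illustration under stated assumptions: the abstract does not describe how node2vec was modified, so a plain unbiased walk is used here instead, and the pathway and node names (`pathway_A`, `drug_1`, `gene_1`, and so on) are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy hypergraph: each hyperedge is one pathway of any length.
hyperedges = {
    "pathway_A": ["drug_1", "gene_1", "gene_2", "AD"],
    "pathway_B": ["drug_2", "gene_2", "gene_3"],
    "pathway_C": ["drug_1", "gene_3", "AD"],
}

def build_incidence(hyperedges):
    """Map each node to the list of hyperedges that contain it."""
    node_to_edges = defaultdict(list)
    for edge_id, nodes in hyperedges.items():
        for node in nodes:
            node_to_edges[node].append(edge_id)
    return dict(node_to_edges)

def random_walk(hyperedges, node_to_edges, start, length, rng):
    """Uniform walk: pick a hyperedge of the current node, then a node in it."""
    walk = [start]
    current = start
    for _ in range(length - 1):
        edge_id = rng.choice(node_to_edges[current])
        current = rng.choice(hyperedges[edge_id])
        walk.append(current)
    return walk

rng = random.Random(0)
node_to_edges = build_incidence(hyperedges)
walks = [
    random_walk(hyperedges, node_to_edges, node, 5, rng)
    for node in node_to_edges
    for _ in range(3)
]
for walk in walks[:3]:
    print(" -> ".join(walk))
```

In a node2vec-style pipeline, walks like these would typically be passed to a skip-gram model to learn embeddings; that step is omitted here to keep the sketch dependency-free.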

Pacific Symposium on Biocomputing, vol. 29, pp. 261-275, 2024.
LARGE LANGUAGE MODELS (LLMS) AND CHATGPT FOR BIOMEDICINE.
Cecilia Arighi, Steven Brenner, Zhiyong Lu

Large Language Models (LLMs) are a type of artificial intelligence that has been revolutionizing various fields, including biomedicine. They can process and analyze large amounts of data, understand natural language, and generate new content, making them highly desirable in many biomedical applications and beyond. In this workshop, we aim to give attendees an in-depth understanding of the rise of LLMs in biomedicine and of how they are being used to drive innovation and improve outcomes in the field, along with the associated challenges and pitfalls.

Pacific Symposium on Biocomputing, vol. 29, pp. 641-644, 2024.