Pacific Symposium on Biocomputing: latest articles

Practical Approaches to Enhancing Fairness, Social Responsibility and the Inclusion of Diverse Viewpoints in Biomedicine.
Daphne O Martschenko, Nicole Martinez-Martin, Meghan Halley

The following sections are included: Workshop Description; Learning Objectives; Presenter Information; About the Workshop Organizers; Presentations; Speaker Presentations.

Published: 2024-01-01. Type: Journal Article.
Citations: 0
Risk prediction: Methods, Challenges, and Opportunities.
Ruowang Li, Rui Duan, Lifang He, Jason H Moore

The following sections are included: Introduction to the workshop; Workshop Presenters.

Published: 2024-01-01. Type: Journal Article.
Citations: 0
Session Introduction: Drug-repurposing and discovery in the era of "big" real-world data: how the incorporation of observational data, genetics, and other -omic technologies can move us forward.
Megan M Shuey, Jacklyn N Hellwege, Nikhil Khankari, Marijana Vujkovic, Todd L Edwards

This PSB 2024 session discusses the broad range of biological, computational, and statistical approaches currently used for therapeutic drug target identification and for repurposing existing treatments. Drug repurposing efforts have the potential to dramatically improve the treatment landscape by more rapidly identifying drug targets and alternative strategies for untreated or poorly managed diseases. The overarching theme of this session is the use and integration of real-world data to identify drug-disease pairs with potential therapeutic use. These drug-disease pairs may be identified through genomic, proteomic, biomarker, and protein interaction analyses, electronic health records, and chemical profiling. Taken together, this session combines novel applications of methods and innovative modeling strategies with diverse real-world data to suggest new pharmaceutical treatments for human diseases.

Published: 2024-01-01. Type: Journal Article.
Citations: 0
KombOver: Efficient k-core and K-truss based characterization of perturbations within the human gut microbiome.
Nicolae Sapoval, Marko Tanevski, Todd J Treangen

The microbes present in the human gastrointestinal tract are regularly linked to human health and disease outcomes. Thanks to technological and methodological advances in recent years, metagenomic sequencing data, and computational methods designed to analyze them, have contributed to an improved understanding of the link between the human gut microbiome and disease. However, while numerous methods have recently been developed to extract quantitative and qualitative results from host-associated microbiome data, improved computational tools are still needed to track microbiome dynamics with short-read sequencing data. We previously proposed KOMB as a de novo tool for identifying copy number variations in metagenomes in order to characterize microbial genome dynamics in response to perturbations. In this work, we present KombOver (KO), which makes four key contributions relative to our previous work: (i) it scales to large microbiome study cohorts, (ii) it includes both k-core and K-truss based analysis, (iii) it provides the foundation of a theoretical understanding of the relation between various graph-based metagenome representations, and (iv) it provides an improved user experience with easier-to-run code and more descriptive outputs/results. To highlight these benefits, we applied KO to nearly 1000 human microbiome samples; each sample required less than 10 minutes and 10 GB of RAM to process. Furthermore, we highlight how graph-based approaches such as k-core and K-truss can be informative for pinpointing microbial community dynamics within a myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) cohort. KO is open source and available for download/use at: https://github.com/treangenlab/komb.
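
For readers unfamiliar with the two graph decompositions named in this abstract, the following is a minimal sketch of what a k-core and a k-truss are, computed by repeated peeling on a small made-up graph. It is not KombOver's implementation: the function names, the toy graph, and the convention that a k-truss keeps edges lying in at least k - 2 triangles are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def k_core(adj, k):
    """Return the node set of the k-core: the maximal subgraph in which
    every node has at least k neighbours inside the subgraph."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            # Count only neighbours that have not been peeled away yet.
            if sum(1 for u in adj[v] if u in alive) < k:
                alive.discard(v)
                changed = True
    return alive


def k_truss(adj, k):
    """Return the edge set of the k-truss: the maximal subgraph in which
    every edge lies in at least k - 2 triangles of the subgraph
    (conventions differ between texts; this one is an assumption)."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u] if u != v}
    changed = True
    while changed:
        changed = False
        # Rebuild adjacency from the edges that survived the previous pass.
        nbrs = defaultdict(set)
        for e in edges:
            u, v = tuple(e)
            nbrs[u].add(v)
            nbrs[v].add(u)
        for e in list(edges):
            u, v = tuple(e)
            # Common neighbours of u and v = triangles containing edge (u, v).
            if len(nbrs[u] & nbrs[v]) < k - 2:
                edges.discard(e)
                changed = True
    return edges


# Toy graph (made up): a 4-clique a-b-c-d with a two-node tail d-e-f.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
toy_adj = build_adjacency(toy_edges)
core3 = k_core(toy_adj, 3)
truss4 = k_truss(toy_adj, 4)
print(sorted(core3))  # ['a', 'b', 'c', 'd']
print(len(truss4))    # 6
```

In this toy case both decompositions strip the tail and keep the clique; on real assembly-style graphs they can disagree, because the k-core constrains node degree while the k-truss constrains how many triangles each edge takes part in.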

Published: 2024-01-01. Type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10764071/pdf/
Citations: 0
Low- and high-level information analyses of transcriptome connecting endometrial-decidua-placental origin of preeclampsia subtypes: A preliminary study.
Herdiantri Sufriyana, Yu-Wei Wu, Emily Chia-Yu Su

Background: The pathogenesis proposed so far for preeclampsia (PE) applies only to the early-onset subtype and does not consider pre-pregnancy and competing risks. We aimed to decipher PE subtypes by identifying related transcriptomes that represent endometrial maturation and histologic chorioamnionitis.

Methods: We used eight mRNA expression arrays for discovery (n=289) and eight others for validation (n=352). We overlapped differentially expressed genes (DEGs) between: (1) healthy samples from endometrium, decidua, and placenta, plus placenta samples with histologic chorioamnionitis; and (2) placenta samples for each of the subtypes. The subtypes were all possible combinations along four axes: (1) pregnancy-induced hypertension; (2) placental dysfunction-related diseases (e.g., fetal growth restriction [FGR]); (3) onset; and (4) severity.

Results: The DEGs of endometrium at the late-secretory phase, but none of those of decidua, significantly overlapped with those of any subtype with: (1) early onset (p-values ≤0.008); (2) severe hypertension and proteinuria (p-values ≤0.042); or (3) chronic hypertension and/or severe PE with FGR (p-values ≤0.042). Although their DEGs significantly overlapped with those of the same subtypes, gene regulation was mostly counter-expressed in placenta with chorioamnionitis (n=13/18, 72.22%; odds ratio [OR] upper bounds ≤0.21) but co-expressed in late-secretory endometrium (n=3/9, 66.67%; OR lower bounds ≥1.17). The placental DEGs at neither the first nor the second trimester of normotensive pregnancy significantly overlapped with those of late-onset, severe PE without FGR.

Conclusions: We identified the transcriptome of endometrial maturation in placental dysfunction that distinguished early- and late-onset PE, and indicated chorioamnionitis as a competing risk for PE. This study suggests that it is feasible to develop and validate pathogenesis models that include pre-pregnancy and competing risks, in order to decide whether prospective data for PE, including chorioamnionitis information, need to be collected starting from pre-pregnancy.
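
The abstract reports p-values and odds ratios for overlaps between DEG lists but does not name the test used. As a hedged illustration of how such an overlap is commonly assessed, the sketch below computes a one-sided hypergeometric p-value and an odds ratio from a 2x2 table. The gene identifiers, set sizes, and the size of the gene universe are made up, and the function names are hypothetical.

```python
from math import comb


def overlap_p_value(set_a, set_b, universe_size):
    """One-sided hypergeometric p-value: the probability of seeing at least
    the observed overlap if set_b were drawn at random from the universe."""
    a, b = len(set_a), len(set_b)
    observed = len(set_a & set_b)
    total = comb(universe_size, b)
    tail = sum(
        comb(a, i) * comb(universe_size - a, b - i)
        for i in range(observed, min(a, b) + 1)
    )
    return tail / total


def overlap_odds_ratio(set_a, set_b, universe_size):
    """Odds ratio of the 2x2 table (in A / not in A) x (in B / not in B)."""
    both = len(set_a & set_b)
    only_a = len(set_a - set_b)
    only_b = len(set_b - set_a)
    neither = universe_size - len(set_a | set_b)
    if only_a == 0 or only_b == 0:
        return float("inf")  # degenerate table; no finite estimate
    return (both * neither) / (only_a * only_b)


# Made-up example: 20 measured genes, two DEG lists of 5 genes sharing 2.
degs_tissue = {"g0", "g1", "g2", "g3", "g4"}
degs_subtype = {"g3", "g4", "g5", "g6", "g7"}
p = overlap_p_value(degs_tissue, degs_subtype, universe_size=20)
odds = overlap_odds_ratio(degs_tissue, degs_subtype, universe_size=20)
print(round(p, 4), round(odds, 4))  # 0.3661 2.6667
```

An odds ratio above 1 indicates more sharing than expected by chance and one below 1 indicates less; the abstract's OR bounds concern co-expression versus counter-expression, which this sketch does not model.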

Published: 2024-01-01. Type: Journal Article.
Citations: 0
nSEA: n-Node Subnetwork Enumeration Algorithm Identifies Lower Grade Glioma Subtypes with Altered Subnetworks and Distinct Prognostics.
Zhihan Zhang, Christiana Wang, Ziyin Zhao, Ziyue Yi, Arda Durmaz, Jennifer S Yu, Gurkan Bebek

Advances in molecular characterization have reshaped our understanding of low-grade glioma (LGG) subtypes, emphasizing the need for comprehensive classification beyond histology. Leveraging this, we present a novel approach, network-based Subnetwork Enumeration and Analysis (nSEA), to identify distinct LGG patient groups based on dysregulated molecular pathways. Using gene expression profiles from 516 patients and a protein-protein interaction network, we generated 25 million subnetworks. Through our unsupervised bottom-up approach, we selected 92 subnetworks that categorized LGG patients into five groups. Notably, a new LGG patient group lacking mutations in EGFR, NF1, and PTEN emerged as a previously unidentified patient subgroup with unique clinical features and subnetwork states. Validation of the patient groups on an independent dataset demonstrated the robustness of our approach and revealed consistent survival traits across different patient populations. This study offers a comprehensive molecular classification of LGG, providing insights beyond traditional genetic markers. By integrating network analysis with patient clustering, we unveil a previously overlooked patient subgroup with potential implications for prognosis and treatment strategies. Our approach sheds light on the synergistic nature of driver genes and highlights the biological relevance of the identified subnetworks. With broad implications for glioma research, our findings pave the way for further investigations into the mechanistic underpinnings of LGG subtypes and their clinical relevance. Availability: Source code and supplementary data are available at https://github.com/bebeklab/nSEA.
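
The abstract does not describe how the n-node subnetworks are enumerated, so the following is only a sketch of one standard way to list every connected node set of a given size exactly once (an ESU-style recursive extension). The function name, the toy network, and the use of integer node ids are assumptions for illustration; the authors' repository is the reference for the actual method.

```python
def connected_subnetworks(adj, n):
    """Enumerate every connected node set of size n exactly once.

    adj maps each node to the set of its neighbours; node ids must be
    mutually comparable (e.g. all integers) so that each subnetwork is
    discovered only from its smallest node.
    """
    found = []

    def extend(sub, ext, root):
        if len(sub) == n:
            found.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # New candidates: neighbours of w that are larger than the root,
            # not yet in the subnetwork, and not adjacent to any node in it
            # (those were already offered as candidates earlier).
            exclusive = {
                u for u in adj[w]
                if u > root and u not in sub
                and all(u not in adj[s] for s in sub)
            }
            extend(sub | {w}, ext | exclusive, root)

    for root in adj:
        extend({root}, {u for u in adj[root] if u > root}, root)
    return found


# Toy interaction network (made up): triangle 0-1-2 plus a pendant node 3 on 2.
toy_network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
subnets = connected_subnetworks(toy_network, 3)
print(sorted(sorted(s) for s in subnets))  # [[0, 1, 2], [0, 2, 3], [1, 2, 3]]
```

The number of connected subnetworks grows very quickly with network size and n, which is consistent with the 25 million subnetworks the abstract reports and with the need for a selection step afterwards.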

Published: 2024-01-01. Type: Journal Article.
Citations: 0
PopGenAdapt: Semi-Supervised Domain Adaptation for Genotype-to-Phenotype Prediction in Underrepresented Populations.
Marçal Comajoan Cara, Daniel Mas Montserrat, Alexander G Ioannidis

The lack of diversity in genomic datasets, currently skewed towards individuals of European ancestry, presents a challenge in developing inclusive biomedical models. The scarcity of such data is particularly evident in labeled datasets that include genomic data linked to electronic health records. To address this gap, this paper presents PopGenAdapt, a genotype-to-phenotype prediction model that adopts semi-supervised domain adaptation (SSDA) techniques originally proposed for computer vision. PopGenAdapt is designed to leverage the substantial labeled data available from individuals of European ancestry, as well as the limited labeled data and the larger amount of unlabeled data from currently underrepresented populations. The method is evaluated in underrepresented populations from Nigeria, Sri Lanka, and Hawaii for the prediction of several disease outcomes. The results suggest a significant improvement in the performance of genotype-to-phenotype models for these populations over state-of-the-art supervised learning methods, positioning SSDA as a promising strategy for creating more inclusive machine learning models in biomedical research. Our code is available at https://github.com/AI-sandbox/PopGenAdapt.

Published: 2024-01-01. Type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10906137/pdf/
Citations: 0
Expanding the access of wearable silicone wristbands in community-engaged research through best practices in data analysis and integration.
Lisa M Bramer, Holly M Dixon, David J Degnan, Diana Rohlman, Julie B Herbstman, Kim A Anderson, Katrina M Waters

Wearable silicone wristbands are a rapidly growing exposure assessment technology that offers researchers the ability to study previously inaccessible cohorts and has the potential to provide a more comprehensive picture of chemical exposure within diverse communities. However, there are no established best practices for analyzing the data within a study or across multiple studies, which limits the impact and accessibility of these data for larger meta-analyses. We use data from three studies, comprising over 600 wristbands worn by participants in New York City and Eugene, Oregon, to present a first-of-its-kind manuscript detailing wristband data properties. We further discuss, with concrete examples, key areas and considerations in common statistical modeling methods where best practices must be established to enable meta-analyses and integration of data from multiple studies. Finally, we detail important and challenging aspects of machine learning, meta-analysis, and data integration that researchers will face in order to extend beyond the limited scope of individual studies focused on specific populations.

Published: 2024-01-01. Type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10766083/pdf/
Citations: 0
Systematic Estimation of Treatment Effect on Hospitalization Risk as a Drug Repurposing Screening Method.
Costa Georgantas, Jaume Banus, Roger Hullin, Jonas Richiardi

Drug repurposing (DR) aims to identify new uses for approved medications outside their original indication. Computational methods for finding DR candidates usually rely on prior biological and chemical information on a specific drug or target but rarely use real-world observations. In this work, we propose a simple and effective systematic screening approach to measure medication impact on hospitalization risk based on large-scale observational data. We use common classification systems to group drugs and diseases into broader functional categories and test for non-zero effects in each drug-disease category pair. Treatment effects on the hospitalization risk of an individual disease are obtained by combining widely used methods for causal inference and time-to-event modelling. We tested 6468 drug-disease pairs using data from the UK Biobank, focusing on cardiovascular, metabolic, and respiratory diseases. We determined key parameters to reduce the number of spurious correlations and identified 7 statistically significant associations of reduced hospitalization risk after correcting for multiple testing. Some of these associations were already reported in other studies, including new potential applications for cardioselective beta-blockers and thiazides. We also found evidence for proton pump inhibitor side effects and multiple possible associations for anti-diabetic drugs. Our work demonstrates the applicability of the present screening approach and the utility of real-world data for identifying potential DR candidates.
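
With thousands of drug-disease pairs tested, the correction for multiple testing largely determines how many associations survive. The abstract does not say which correction was applied, so the sketch below shows two common choices, Bonferroni and Benjamini-Hochberg, on made-up p-values. The function names and the significance level of 0.05 are assumptions.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m,
    which controls the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure, which controls the false
    discovery rate. Returns one reject/keep flag per input p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank whose p-value falls under its rank-scaled threshold.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected


# Made-up p-values for ten hypothetical drug-disease category pairs.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bonf = bonferroni(example_p)
bh = benjamini_hochberg(example_p)
print(sum(bonf), sum(bh))  # 1 2
```

For scale, a Bonferroni threshold over the 6468 pairs mentioned in the abstract at a level of 0.05 would be 0.05 / 6468, roughly 7.7e-6 per test, which illustrates why only a handful of associations would be expected to remain significant under a strict correction.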
Published in Pacific Symposium on Biocomputing, 2024-01-01. Publication type: Journal Article.
Citations: 0
VetLLM: Large Language Model for Predicting Diagnosis from Veterinary Notes.
Yixing Jiang, Jeremy A Irvin, Andrew Y Ng, James Zou

Lack of diagnosis coding is a barrier to leveraging veterinary notes for medical and public health research. Previous work has been limited to developing specialized rule-based or customized supervised learning models to predict diagnosis coding, which is tedious and not easily transferable. In this work, we show that open-source large language models (LLMs) pretrained on a general corpus can achieve reasonable performance in a zero-shot setting. Alpaca-7B can achieve a zero-shot F1 of 0.538 on CSU test data and 0.389 on PP test data, two standard benchmarks for coding from veterinary notes. Furthermore, with appropriate fine-tuning, the performance of LLMs can be substantially boosted, exceeding that of strong state-of-the-art supervised models. VetLLM, which is fine-tuned on Alpaca-7B using just 5000 veterinary notes, can achieve an F1 of 0.747 on CSU test data and 0.637 on PP test data. It is of note that our fine-tuning is data-efficient: fine-tuning with just 200 notes can outperform supervised models trained with more than 100,000 notes. The findings demonstrate the great potential of leveraging LLMs for language processing tasks in medicine, and we advocate this new paradigm for processing clinical text.
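
The abstract reports F1 scores for diagnosis coding but does not say how F1 is averaged across codes. The sketch below is one plausible reading, assuming each note carries a set of diagnosis codes and that scores are micro-averaged; the function name, the code labels, and the averaging choice are all assumptions made for this illustration.

```python
def micro_f1(gold: list[set[str]], pred: list[set[str]]) -> float:
    """Micro-averaged F1 over per-note sets of diagnosis codes.

    Hypothetical helper: micro-averaging is an assumption, since the
    abstract does not specify the averaging scheme.
    """
    tp = fp = fn = 0
    for gold_codes, pred_codes in zip(gold, pred):
        tp += len(gold_codes & pred_codes)   # codes predicted and correct
        fp += len(pred_codes - gold_codes)   # codes predicted but wrong
        fn += len(gold_codes - pred_codes)   # codes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two toy notes with made-up code labels.
gold_labels = [{"code_a", "code_b"}, {"code_c"}]
pred_labels = [{"code_a"}, {"code_c", "code_d"}]
score = micro_f1(gold_labels, pred_labels)
```

Micro-averaging pools true positives, false positives, and false negatives over all notes, so frequent codes dominate the score; macro-averaging over codes would weight rare diagnoses more heavily.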
Published in Pacific Symposium on Biocomputing, 2024-01-01. Publication type: Journal Article.
Citations: 0
Journal
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing