Biodata Mining: Latest Articles

Development of an AI-powered AR glasses system for real-time first aid guidance in emergency situations.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-26; DOI: 10.1186/s13040-025-00473-6
Mohammed Abo-Zahhad, Mostafa N Zakaria, Farida M Sharaf, May M Ismaiel, Habiba Hafrag, Yousef M Amer
Citations: 0
Mapping the evolving trend of research on efferocytosis: a comprehensive data-mining-based study.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-25; DOI: 10.1186/s13040-025-00475-4
Yanpeng Jian, Shijia Dong, Weijie Liu, Genfeng Li, Xiaoyu Lian, Yigong Wang

Background: Efferocytosis, the process by which apoptotic cells are recognized and removed by phagocytes, plays a critical role in maintaining tissue homeostasis and modulating inflammatory responses. Over recent decades, an increasing number of studies have investigated the molecular mechanisms and clinical implications of efferocytosis. This bibliometric analysis aims to map the evolving trends, identify key contributors, and outline emerging research themes in this field.

Methods: A comprehensive search of the Web of Science database was conducted to collect literature related to efferocytosis published from 2006 to 2024. The dataset was analyzed with tools including CiteSpace and VOSviewer. Analyses covered publication trends, citation networks, keyword co-occurrence, and co-cited references. Key metrics such as the most prolific authors, top contributing countries, and major research clusters were identified to understand the field's evolution and interdisciplinary collaborations.

Results: The final dataset comprised 1549 scholarly works: 1166 original research articles and 383 review papers. The analysis revealed a steady increase in the number of publications on efferocytosis, particularly in the past decade. Geographically, China and the United States were the dominant contributors, together accounting for over 64.4% of total publications. Among institutions, Harvard University had the highest research output in this field. Keyword analysis showed that current research focuses on the molecular mechanisms and signaling regulation of efferocytosis, macrophage polarization and inflammatory modulation, and the pathological implications and therapeutic potential of efferocytosis in disease. Inflammation, atherosclerosis, cardiovascular disease, myocardial infarction, and COPD are the conditions that have received the most attention in this field. Several research topics, including nanoparticles, neuroinflammation, fibrosis, immunometabolism, exosomes, apoptotic bodies, mesenchymal stem cells, aging, microglia, reactive oxygen species, CD47, lipid metabolism, immunotherapy, mitochondria, and ferroptosis, may become hot topics in the near future. Gene-focused investigations identified TNF, MERTK, IL10, IL6, and IL1b as the most extensively studied genetic elements in efferocytosis research.

Conclusions: This bibliometric study provides a comprehensive overview of the evolving research landscape in efferocytosis. These insights not only highlight the current milestones but also serve as a valuable guide for future research and policy-making aimed at harnessing efferocytosis for therapeutic innovations.
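
The keyword co-occurrence analysis mentioned in the Methods can be illustrated with a minimal Python sketch. It is not the CiteSpace or VOSviewer implementation; it only counts how often two keywords appear in the same record. The function name `keyword_cooccurrence` and the sample records are hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same record."""
    pair_counts = Counter()
    for keywords in records:
        # Normalise case and drop duplicates within a single record.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical keyword lists, NOT taken from the study's dataset.
records = [
    ["efferocytosis", "macrophage", "inflammation"],
    ["efferocytosis", "atherosclerosis", "macrophage"],
    ["Efferocytosis", "inflammation"],
]
counts = keyword_cooccurrence(records)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs are stored in alphabetical order so that ("a", "b") and ("b", "a") are counted as the same edge; the resulting counts could serve as edge weights in a co-occurrence network.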

Citations: 0
The application of artificial intelligence models in predicting the risk of diabetic foot: a multicenter study.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-21; DOI: 10.1186/s13040-025-00477-2
Yao Li, Siyuan Zhou, Bichen Ren, Shuai Ju, Xiaoyan Li, Wenqiang Li, Bingzhe Li, Yunmin Cai, Chunlei Chang, Lihong Huang, Zhihui Dong

This study explores diabetic foot (DF), a severe complication of diabetes, by combining deep learning (DL) and machine learning (ML) to develop a multi-model prediction tool. Early identification of high-risk DF patients can reduce disability and mortality. The research also aims to create an integrated application that helps clinicians perform precise, efficient risk assessment for early intervention. In this multicenter retrospective study, 6,180 elderly diabetic patients (aged 60-85) were enrolled from 11 community hospitals in Shanghai in 2024. Lasso regression was used to identify 16 key DF risk factors, including age, MMSE score, lower limb discomfort, ABI, and hematocrit. Fourteen ML models (RF, XGBoost, CART, MLP, etc.) and three DL models (DNN, CNN, Transformer) were trained, with hyperparameters optimized via cross-validation and grid search. An application integrating these models was developed, offering both single and batch prediction options with visualization tools for clinical use. Experimental results showed that the Logistic regression ensemble model achieved robust performance, with AUC values of 0.943 (validation set, 95% CI: 0.935-0.951) and 0.938 (test set, 95% CI: 0.929-0.947), along with high accuracy, precision, recall, and F1 scores. SHAP analysis revealed key predictive features including ABI results, lower limb discomfort, and MMSE score. The developed app integrates multiple models, compares their predictions for different clinical scenarios, and enhances prediction transparency and reliability. The multi-model approach demonstrates strong predictive performance for DF risk, offering clinicians an intuitive and accurate assessment tool tailored to individual patients. Combining multiple models improves result stability and clinical applicability compared with single-model approaches. Future work will focus on algorithm optimization, expanded datasets, and real-time monitoring integration to enable more precise, dynamic risk evaluation for improved DF prevention and early intervention.
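
The abstract reports AUC values with 95% confidence intervals but does not say how the intervals were obtained. The Python sketch below shows one common approach, a percentile bootstrap, purely as an assumption. The function names `auc` and `bootstrap_ci` and the toy labels and scores are hypothetical and do not come from the study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # this resample lacks one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data invented for illustration: 1 = diabetic foot, 0 = no diabetic foot.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]
point = auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUC = {point:.4f}, 95% CI: {lo:.3f}-{hi:.3f}")
```

With only eight toy observations the interval is very wide; the narrow intervals in the abstract reflect the much larger cohort.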

Citations: 0
A simple guide to the use of Student's t-test, Mann-Whitney U test, Chi-squared test, and Kruskal-Wallis test in biostatistics.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-20; DOI: 10.1186/s13040-025-00465-6
Davide Chicco, Andrea Sichenze, Giuseppe Jurman

In an age when machine learning and artificial intelligence are broadly employed, traditional statistics can still provide insightful information and results quickly and at a low computational cost. Statistics, in fact, offers many useful tools to researchers, including a series of univariate statistical tests that can identify relationships between pairs of numeric samples: Student's t-test, Mann-Whitney U test, Chi-squared test, and Kruskal-Wallis test. These tests generate several outcomes, including probability values (p-values), which are compared against a chosen threshold to decide whether to reject the null hypothesis. Although effective, these tests are often misused or employed in the wrong contexts, especially in biostatistics studies. Many scientific researchers do not seem to know how to choose one test over the others, and this misuse can lead to incorrect results and wrong conclusions. Here we present a simple theoretical and practical guide to the use of these four tests, first describing their theoretical properties and then displaying the results obtained by applying these tests to real-world medical datasets. Finally, we explain when and how to use each test based on the data types of the samples considered. Our study can have a strong impact on scientific research by potentially influencing future studies involving these tests. Our recommendations, in turn, can help researchers produce more reliable and sound scientific results, thus increasing the quality of multiple scientific studies across various fields.
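
To make one of the four tests concrete, the Python sketch below computes the Mann-Whitney U statistic from scratch, using midranks for tied values. It is a textbook calculation, not the authors' code, and it stops at the statistic: in practice a statistics library would also supply the p-value. The function name `mann_whitney_u` and the two samples are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    # Pool both samples, remembering the group of each value (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Toy measurements invented for illustration.
group_a = [1.1, 2.3, 2.9]
group_b = [3.5, 4.0, 2.5, 5.2]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

Because the statistic depends only on ranks, it suits ordinal data or numeric data that are not approximately normal, which is the usual textbook reason to prefer it over Student's t-test.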

Citations: 0
Skin in the game: a review of computational models of the skin.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-19; DOI: 10.1186/s13040-025-00471-8
Seda Ceylan, Didem Demir, Cayla Harris, Semih Latif İpek, Vasileios Vavourakis, Marco Manca, Sandrine Dubrac, Roman Bauer

With the vast advances in computing technology, computational (or in silico) modelling has emerged as a transformative tool in dermatology. Such models can provide novel insights into complex biological processes and aid in the development of innovative therapeutic and regenerative strategies for the skin. Modelling combines experimental data and knowledge across multiple disciplines, serving as a common framework to elucidate the workings of the skin. From a biomedical perspective, the mechanisms of skin diseases can be studied by simulating cellular interactions and signalling pathways. Computational investigations of these mechanisms can be categorised into two distinct approaches: data-driven and model-based. Data-driven approaches allow the diagnosis of skin diseases on the basis of data collected via imaging or feedback from portable sensors, often yielding performance exceeding that of their human counterparts. Model-based methods are well suited to address topics such as skin cell biology and biomechanics, contributing to wound healing and skin cancer research. Furthermore, such modelling has found utility in the development of virtual skin models and skin-on-chip devices, enabling the prediction of skin responses to various substances, including cosmetics and drugs. In the realm of dermatological surgery, computational tools have been instrumental in optimizing surgical planning and improving clinical outcomes. While significant advancements have been made, challenges such as data availability, model validation, and interdisciplinary collaboration persist. This review highlights the current state of the art in computational modelling in dermatology, identifies key challenges, and outlines its prospects.

Citations: 0
Exploring the common genetic basis of metabolic syndrome-related diseases and chronic kidney disease: insights from extensive genome-wide cross-trait analyses.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-17; DOI: 10.1186/s13040-025-00472-7
Yu Yin, Chenkai Zhao, Yibo Hua, Fei Yang, Dandan Qiu, Jiasheng Yan, Xiaodong Jin

Background: Chronic kidney disease (CKD) is a globally prevalent chronic condition characterized by progressive renal function decline, imposing significant economic and psychological burdens on patients. Metabolic syndrome (MetS), characterized by obesity, hypertension, hyperglycemia, and dyslipidemia, is a significant risk factor for CKD. A strong epidemiological association exists between CKD and MetS. This study explores the genetic connections between MetS-related diseases and CKD, focusing on identifying shared risk loci, key tissues, and underlying genetic mechanisms.

Methods: We performed a cross-trait pleiotropy analysis using summary-level GWAS data from ten MetS-related diseases and CKD obtained from the IEU database to detect shared pleiotropic loci and genes. Functional annotation and tissue-specific analyses were conducted to reveal potential associations between CKD and MetS. Additionally, we used metabolite colocalization methods to explore the metabolic perspective of these diseases' associations. Finally, Mendelian randomization (MR) was employed for further association analysis.

Results: The study identified shared genetic mechanisms between MetS-related diseases and CKD, revealing 1,437 pleiotropic loci at genome-wide significance. Forty-four dominant risk SNP loci were annotated, with 11 loci confirmed through causal colocalization analysis. Further gene-level analysis identified eight unique pleiotropic genes, including APOC1, APOE, BICC1, and PDILT. Pathway analysis identified significant involvement of the Metabolism of Fat-Soluble Vitamins, Positive Regulation of Plasma Membrane-Bounded Cell Projection Assembly, and Positive Regulation of RNA Metabolic Process pathways in these diseases. Tissue enrichment analyses at the SNP and gene levels indicated that pleiotropic mechanisms play crucial roles in the Adipose Visceral Omentum, Brain Cerebellum, and Testis. Finally, phenotypic-level metabolite colocalization analysis revealed a metabolic intermediary mechanism linking MetS-related diseases and CKD.

Conclusion: This study uncovers the complex genetic interactions between CKD and MetS-related diseases, identifying shared genetic loci and biological pathways, providing novel insights for future therapeutic strategies.
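
The abstract does not state which Mendelian randomization estimator was used. As an illustration only, the Python sketch below implements the fixed-effect inverse-variance weighted (IVW) estimator, a common choice for summary-level GWAS data; treating it as the method here is an assumption. The function name `ivw_estimate` and the effect sizes are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each variant contributes a ratio estimate beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2 (first-order weights).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


# Invented per-variant summary statistics (exposure effect, outcome effect, outcome SE).
bx = [0.1, 0.2, 0.4]
by = [0.05, 0.10, 0.20]
se = [0.01, 0.02, 0.02]
estimate, std_error = ivw_estimate(bx, by, se)
print(f"IVW estimate = {estimate:.3f} (SE {std_error:.4f})")
```

In this toy input every variant has the same ratio, so the pooled estimate equals that ratio; with real data the weights let precisely measured variants dominate.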

Citations: 0
Short- and long-term weekly patient-reported outcomes prediction undergoing radiotherapy: single-patient time series model vs. transformer-based multi-patient time series model.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-08-12; DOI: 10.1186/s13040-025-00464-7
Yang Yan, Zhong Chen, Xinglei Shen, Ronald C Chen, Hao Gao

Background: Patient-reported outcomes (PROs) are direct reports from patients on health status, symptoms, quality of life, or treatment satisfaction, offering critical insights into subjective experiences that clinical metrics may overlook. Accurately predicting personalized short- and long-term weekly PROs during radiotherapy is essential for monitoring health status, optimizing treatment efficacy, and enabling timely interventions to manage side effects.

Methods: Using a well-documented prostate cancer PRO dataset (17 patients after pre-processing), this study evaluates single-patient time series models (i.e., vector autoregression (VAR) and VAR with incremental ground-truth PRO data (VAR-Inc)) and a transformer-based multi-patient model (i.e., Temporal Fusion Transformer (TFT)) for short- and long-term weekly PRO prediction. VAR-Inc integrates follow-up PRO data to refine predictions, while TFT leverages heterogeneous multi-patient information to capture complex temporal patterns.

Results: Key experimental results on prostate cancer patients show that (1) VAR-Inc outperformed VAR (lower MAE and RMSE), highlighting the importance of incremental PRO updates; (2) TFT outperformed both VAR models in long-term prediction, with statistical significance, by utilizing multi-patient data; (3) TFT effectively captured weekly PRO trends and variations, aligning closely with the ground truth; and (4) unlike the single-patient models, TFT built a robust predictive framework by integrating cross-patient similarities and complementary PRO information from other patients. VAR-Inc's performance deteriorated when follow-up PROs were missing, whereas TFT remained stable. On average, TFT achieved the lowest MAE (0.7715), compared with 1.1329 for VAR and 0.8089 for VAR-Inc, and the lowest RMSE (0.9586), compared with 1.4817 for VAR and 1.0693 for VAR-Inc.

Conclusion: TFT emerges as a reliable approach for PRO prediction, excelling in long-term accuracy, trend capture, and resilience to data gaps by leveraging multi-patient information. Its ability to synthesize heterogeneous PRO data offers advantages over single-patient models, supporting personalized treatment adaptation and informed clinical decision-making. This underscores the potential of transformer-based models in enhancing PRO-driven radiotherapy management.
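
The error metrics quoted in the Results can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code: `mae`, `rmse`, and `relative_reduction` are hypothetical helper names, the toy PRO scores are made up, and only the six averaged MAE/RMSE values are taken from the abstract.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline, candidate):
    """Fraction by which the candidate error is lower than the baseline error."""
    return (baseline - candidate) / baseline


# Toy weekly PRO scores (made up, for illustration only).
toy_true = [3, 4, 2, 5]
toy_pred = [2, 4, 4, 5]
toy_mae = mae(toy_true, toy_pred)
toy_rmse = rmse(toy_true, toy_pred)

# Averaged (MAE, RMSE) pairs as reported in the abstract.
reported = {
    "VAR": (1.1329, 1.4817),
    "VAR-Inc": (0.8089, 1.0693),
    "TFT": (0.7715, 0.9586),
}
tft_mae, tft_rmse = reported["TFT"]
for name in ("VAR", "VAR-Inc"):
    base_mae, base_rmse = reported[name]
    print(
        f"TFT vs {name}: "
        f"MAE lower by {relative_reduction(base_mae, tft_mae):.1%}, "
        f"RMSE lower by {relative_reduction(base_rmse, tft_rmse):.1%}"
    )
```

RMSE penalizes large errors more heavily than MAE, so it is never smaller than MAE on the same predictions. On the reported averages, the gap between TFT and plain VAR is much larger than the gap between TFT and VAR-Inc.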

Citations: 0
Exo-Tox: Identifying Exotoxins from secreted bacterial proteins.
IF 6.1 Zone 3, Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-08-08 DOI: 10.1186/s13040-025-00469-2
Tanja Krueger, Damla A Durmaz, Luisa F Jimenez-Soto

Background: Bacterial exotoxins are secreted proteins that affect target cells and are associated with disease. Their accurate identification can enhance drug discovery and ensure the safety of bacteria-based medical applications. However, current toxin predictors prioritize broad coverage by mixing toxins from multiple biological kingdoms and diverse control sets. This general approach has proven sub-optimal for identifying niche toxins, such as bacterial exotoxins. Recent Protein Language Models offer an opportunity to improve toxin prediction by capturing global sequence context and biochemical properties from protein sequences.

Results: We introduce Exo-Tox, a specialized predictor trained exclusively on curated datasets of bacterial exotoxins and secreted non-toxic bacterial proteins, represented as Protein Language Model embeddings. Compared with Basic Local Alignment Search Tool (BLAST)-based methods and generalized toxin predictors, Exo-Tox performs better across multiple metrics, achieving a Matthews correlation coefficient > 0.9. Notably, its performance remains robust regardless of protein length or the presence of signal peptides. We also analyze its limited transferability to bacteriophage proteins and non-secreted proteins.

Conclusion: Exo-Tox reliably identifies bacterial exotoxins, filling a niche overlooked by generalized predictors. Our findings highlight the importance of domain-specific training data and emphasize that specialized predictors are necessary for accurate classification. We provide open access to the model, training data, and usage guidelines via the LMU Munich Open Data repository.
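
For readers unfamiliar with the headline metric, the Matthews correlation coefficient (MCC) can be computed directly from a binary confusion matrix. The sketch below is a generic illustration, not part of Exo-Tox: the function name `mcc` and the confusion-matrix counts are hypothetical and do not come from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when any marginal sum is zero,
    a common convention for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts: 50 toxins and 50 secreted non-toxic proteins.
toy_mcc = mcc(tp=45, tn=48, fp=2, fn=5)
print(f"MCC = {toy_mcc:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it stays informative when the toxin and non-toxin classes are imbalanced, which is one reason it is often preferred over plain accuracy for classifiers of this kind.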

Citations: 0
Drug repurposing for Alzheimer's disease using a graph-of-thoughts based large language model to infer drug-disease relationships in a comprehensive knowledge graph.
IF 6.1 Zone 3, Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-08-05 DOI: 10.1186/s13040-025-00466-5
Zhiping Paul Wang, Xi Li, Nicholas Matsumoto, Mythreye Venkatesan, Jui-Hsuan Chang, Jay Moran, Hyunjun Choi, Binglan Li, Yufei Meng, Miguel E Hernandez, Jason H Moore

Drug repurposing (DR) offers a promising alternative to the high cost and low success rate of traditional drug development, especially for complex diseases like Alzheimer's disease (AD). This study addressed DR for AD from three key angles: (1) demonstrating how disease-specific knowledge graphs can improve DR performance, (2) evaluating the role of large language models (LLMs) in enhancing the usability and efficiency of these graphs, and (3) assessing whether Graph-of-Thoughts (GoT)-enhanced LLMs, when integrated with AD knowledge graphs, can outperform traditional machine learning and LLM-based approaches. We tested five distinct DR strategies (DR1-DR5) for AD: DR1, a machine learning method using TxGNN; DR2, a machine learning model leveraging the Alzheimer's KnowledgeBase (AlzKB); DR3, an LLM-based chatbot built on AlzKB; DR4, our ESCARGOT framework combining GoT-enhanced LLMs with AlzKB; and DR5, a general reasoning-driven LLM approach. Results showed that AlzKB significantly improved DR outcomes. ESCARGOT further enhanced performance while reducing the need for coding or advanced expertise in knowledge graph analysis. Because the architecture of AlzKB is easily adaptable to other diseases and ESCARGOT can integrate with various knowledge graph platforms, this framework offers a broadly applicable, innovative tool for accelerating drug discovery through repurposing.

Citations: 0
circGPAcorr: an integrative tool for functional annotation of circular RNAs using expression data.
IF 6.1 Zone 3, Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-08-01 DOI: 10.1186/s13040-025-00468-3
Petr Ryšavý, Alikhan Anuarbekov, Michaela Dostálová Merkerová, Jiří Kléma

Circular RNAs play a crucial role in cell development and serve as biomarkers in many diseases. Nevertheless, the function of many circular RNAs remains unknown. This function can be inferred from sponging and silencing interactions with microRNAs and messenger RNAs. We recently proposed a network-based circRNA functional annotation tool, circGPA. However, validation data for RNA interactions are often sparse, and predicted interactions contain many false positives. To address this issue, we propose an extended algorithm named circGPAcorr, which uses expression data to weight the interactions, resulting in more precise functional annotation. To assess the significance of the results, the p-value is calculated by reduction to circGPA, a generating-polynomial-based method. We show that the problem is #P-hard, and thus computationally difficult. The circGPAcorr algorithm is tested on publicly available myelodysplastic syndrome expression data, providing gene ontology annotations that align with the literature on myelodysplastic syndromes. We also demonstrate its performance in the circRNA-disease annotation task.
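
The idea of weighting interactions by expression data can be illustrated with a small sketch. The abstract does not say which correlation measure circGPAcorr uses or how the weights enter the statistic, so everything below is an assumption: the weight is taken to be the absolute Pearson correlation between two expression profiles, and the names `pearson`, `weight_interactions`, `circ_1`, `miR_a`, and `miR_b` are hypothetical, as are the expression values.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def weight_interactions(edges, expression):
    """Map each (rna_a, rna_b) edge to |Pearson r| of the two expression profiles.

    Assumption for illustration: a stronger co-variation across samples
    makes a predicted interaction more credible, regardless of sign.
    """
    return {
        (a, b): abs(pearson(expression[a], expression[b]))
        for a, b in edges
    }


# Made-up expression profiles across four samples.
expression = {
    "circ_1": [1.0, 2.0, 3.0, 4.0],
    "miR_a": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with circ_1
    "miR_b": [2.0, 2.0, 2.0, 2.0],  # constant, carries no signal
}
edges = [("circ_1", "miR_a"), ("circ_1", "miR_b")]
weights = weight_interactions(edges, expression)
for edge, weight in weights.items():
    print(edge, round(weight, 3))
```

In this toy setup, a predicted interaction with no supporting expression signal receives weight zero, which is one way expression data could down-weight the false positives the abstract mentions.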

Citations: 0