Biodata Mining: Latest Articles

A generative deep neural network for pan-digestive tract cancer survival analysis.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-27; DOI: 10.1186/s13040-025-00426-z
Lekai Xu, Tianjun Lan, Yiqian Huang, Liansheng Wang, Junqi Lin, Xinpeng Song, Hui Tang, Haotian Cao, Hua Chai

Background: The accurate identification of molecular subtypes in digestive tract cancer (DTC) is crucial for making informed treatment decisions and selecting potential biomarkers. With the rapid advancement of artificial intelligence, various machine learning algorithms have been successfully applied in this field. However, the complexity and high dimensionality of the data features may lead to overlapping and ambiguous subtypes during clustering.

Results: In this study, we propose GDEC, a multi-task generative deep neural network designed for precise digestive tract cancer subtyping. The network optimization process involves employing an integrated loss function consisting of two modules: the generative-adversarial module facilitates spatial data distribution understanding for extracting high-quality information, while the clustering module aids in identifying disease subtypes. The experiments conducted on digestive tract cancer datasets demonstrate that GDEC exhibits exceptional performance compared to other advanced methodologies and can separate different cancer molecular subtypes that possess both statistical and biological significance. Subsequently, 21 hub genes related to pan-DTC heterogeneity and prognosis were identified based on the subtypes clustered by GDEC. The following drug analysis suggested Dasatinib and YM155 as potential therapeutic agents for improving the prognosis of patients in pan-DTC immunotherapy, thereby contributing to the enhancement of cancer patient survival.

Conclusions: The experiments indicate that GDEC outperforms other deep-learning-based methods, and that the interpretable algorithm can select biologically significant genes and potential drugs for DTC treatment.
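
The integrated loss described in the Results (a generative-adversarial term plus a clustering term) can be illustrated with a minimal NumPy sketch. The abstract does not give the actual formulation, so everything here is an assumption: the clustering term follows the widely used deep-embedded-clustering recipe (Student-t soft assignments and a KL divergence to a sharpened target), the adversarial term is the standard non-saturating generator loss, and the names `soft_assign`, `target_distribution`, `integrated_loss`, and the weight `gamma` are hypothetical.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student-t soft assignments q[i, k] of embeddings z to cluster centers (assumed form)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p[i, k] that put more weight on confident assignments."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p, eps=1e-12):
    """KL(P || Q), averaged over samples."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))) / q.shape[0])

def generator_adversarial_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -mean(log D(G(x))), with d_fake in (0, 1)."""
    return float(-np.mean(np.log(d_fake + eps)))

def integrated_loss(z, centers, d_fake, gamma=0.1):
    """Weighted sum of the two modules; gamma is a hypothetical trade-off weight."""
    q = soft_assign(z, centers)
    p = target_distribution(q)
    return generator_adversarial_loss(d_fake) + gamma * clustering_loss(q, p)
```

In a real training loop both terms would be differentiated with respect to the network weights; the sketch only shows how the two objectives combine into one scalar.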

Biodata Mining 18(1): 9. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771125/pdf/
Citations: 0
Motif clustering and digital biomarker extraction for free-living physical activity analysis.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-22; DOI: 10.1186/s13040-025-00424-1
Ya-Ting Liang, Charlotte Wang

Background: Analyzing free-living physical activity (PA) data presents challenges due to variability in daily routines and the lack of activity labels. Traditional approaches often rely on summary statistics, which may not capture the nuances of individual activity patterns. To address these limitations and advance our understanding of the relationship between PA patterns and health outcomes, we propose a novel motif clustering algorithm that identifies and characterizes specific PA patterns.

Methods: This paper proposes an elastic distance-based motif clustering algorithm for identifying specific PA patterns (motifs) in free-living PA data. The algorithm segments long-term PA curves into short-term segments and uses elastic shape analysis to measure the similarity between activity segments. This enables the discovery of recurring motifs through pattern clustering. Functional principal component analysis (FPCA) is then used to extract digital biomarkers from each motif. These digital biomarkers can subsequently be used to explore the relationship between PA and health outcomes of interest.

Results: We demonstrate the efficacy of our method through three real-world applications. Results show that digital biomarkers derived from these motifs effectively capture the association between PA patterns and disease outcomes, improving the accuracy of patient classification.

Conclusions: This study introduced a novel approach to analyzing free-living PA data by identifying and characterizing specific activity patterns (motifs). The derived digital biomarkers provide a more nuanced understanding of PA and its impact on health, with potential applications in personalized health assessment and disease detection, offering a promising future for healthcare.
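
The segment-then-cluster pipeline from the Methods can be sketched as follows. This is not the authors' algorithm: classic dynamic time warping (DTW) is swapped in as a simple stand-in for the elastic shape distance, a small k-medoids loop stands in for the pattern clustering, and the FPCA step is left out. All function names and the fixed window width are hypothetical.

```python
import numpy as np

def segment(curve, width):
    """Split a long activity curve into non-overlapping windows of equal width."""
    n = len(curve) // width
    return np.asarray(curve[: n * width], dtype=float).reshape(n, width)

def dtw_distance(a, b):
    """Classic dynamic time warping with an absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def pairwise(segments):
    """Symmetric matrix of DTW distances between all segments."""
    k = len(segments)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d[i, j] = d[j, i] = dtw_distance(segments[i], segments[j])
    return d

def k_medoids(d, medoids, n_iter=20):
    """Alternate assignment and medoid update on a precomputed distance matrix."""
    medoids = list(medoids)
    for _ in range(n_iter):
        labels = np.argmin(d[:, medoids], axis=1)
        new = []
        for c in range(len(medoids)):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                new.append(medoids[c])  # keep the old medoid for an empty cluster
                continue
            within = d[np.ix_(members, members)].sum(axis=1)
            new.append(int(members[np.argmin(within)]))
        if new == medoids:
            break
        medoids = new
    labels = np.argmin(d[:, medoids], axis=1)
    return labels, medoids
```

Because DTW tolerates shifts in time, two windows that contain the same burst of activity at slightly different moments end up in the same cluster, which is the property that motivates an elastic distance here.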

Biodata Mining 18(1): 8. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11753168/pdf/
Citations: 0
An ensemble machine learning-based performance evaluation identifies top In-Silico pathogenicity prediction methods that best classify driver mutations in cancer.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-20; DOI: 10.1186/s13040-024-00420-x
Subrata Das, Vatsal Patel, Shouvik Chakravarty, Arnab Ghosh, Anirban Mukhopadhyay, Nidhan K Biswas

Background and objective: Accurate identification and prioritization of driver mutations in cancer are critical for effective patient management. Despite the presence of numerous bioinformatic algorithms for estimating mutation pathogenicity, there is significant variation in their assessments. This inconsistency is evident even for well-established cancer driver mutations. This study aims to develop an ensemble machine learning approach to evaluate the performance (rank) of pathogenic and conservation scoring algorithms (PCSAs) based on their ability to distinguish pathogenic driver mutations from benign passenger (non-driver) mutations in head and neck squamous cell carcinoma (HNSC).

Methods: The study used a dataset from 502 HNSC patients, classifying mutations based on 299 known high-confidence cancer driver genes. Missense somatic mutations in driver genes were treated as driver mutations, while non-driver mutations were randomly selected from other genes. Each mutation was annotated with 41 PCSAs. Three machine learning algorithms (logistic regression, random forest, and support vector machine), along with recursive feature elimination, were used to rank these PCSAs. The final ranking of the PCSAs was determined using rank-average-sort and rank-sum-sort methods.

Results: The random forest algorithm emerged as the top performer among the three tested ML algorithms, with an AUC-ROC of 0.89, compared to 0.83 for the other two, in distinguishing pathogenic driver mutations from benign passenger mutations using all 41 PCSAs. The top 11 PCSAs were selected based on the first quintile cut-off from the final rank-sum distribution. Classifiers built using these top 11 PCSAs (DEOGEN2, Integrated_fitCons, MVP, etc.) demonstrated significantly higher performance (p-value < 2.22e-16) compared to those using the remaining 30 PCSAs across all three ML algorithms, in separating pathogenic driver from benign passenger mutations. The top PCSAs demonstrated strong performance on a validation cohort comprising independent HNSC samples and other cancer types (breast, lung, and colorectal), reflecting their consistency, robustness, and generalizability.

Conclusions: The ensemble machine learning approach effectively evaluates the performance of PCSAs based on their ability to differentiate pathogenic drivers from benign passenger mutations in HNSC and other cancer types. Notably, some well-known PCSAs performed poorly, underscoring the importance of data-driven selection over relying solely on popularity.
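
The rank aggregation step can be illustrated with a short NumPy sketch. The abstract does not define "rank-average-sort" or "rank-sum-sort" precisely, so this assumes the straightforward reading: each model assigns every PCSA a rank (1 = most important), the ranks are summed or averaged across models, and features are sorted by the aggregate. The first-quintile rule is likewise assumed to mean "rank-sum at or below the 20th percentile". The function names and the toy ranks are hypothetical.

```python
import numpy as np

def aggregate_ranks(ranks):
    """Combine per-model ranks of shape (n_models, n_features); 1 = most important."""
    ranks = np.asarray(ranks, dtype=float)
    rank_sum = ranks.sum(axis=0)
    rank_avg = ranks.mean(axis=0)
    # Stable sort keeps the original feature order when aggregates tie.
    order = np.argsort(rank_sum, kind="stable")
    return rank_sum, rank_avg, order

def first_quintile(rank_sum):
    """Indices of features whose rank-sum is at or below the 20th percentile."""
    cutoff = np.percentile(rank_sum, 20)
    return np.where(np.asarray(rank_sum) <= cutoff)[0]

# Toy example: three models ranking five features.
toy_ranks = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
rank_sum, rank_avg, order = aggregate_ranks(toy_ranks)
selected = first_quintile(rank_sum)
```

Sorting by the sum and sorting by the mean give the same order when every model ranks every feature; the two differ only if some models skip features, which this sketch does not handle.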

Biodata Mining 18(1): 7. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11744934/pdf/
Citations: 0
Enriched phenotypes in rare variant carriers suggest pathogenic mechanisms in rare disease patients.
IF 6.1; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-17; DOI: 10.1186/s13040-024-00418-5
Lane Fitzsimmons, Brett Beaulieu-Jones, Shilpa Nadimpalli Kobren

Background: The mechanistic pathways that give rise to the extreme symptoms exhibited by rare disease patients are complex, heterogeneous, and difficult to discern. Understanding these mechanisms is critical for developing treatments that address the underlying causes of diseases rather than merely the presenting symptoms. Moreover, the same dysfunctional series of interrelated symptoms implicated in rare recessive diseases may also lead to milder and potentially preventable symptoms in carriers in the general population. Seizures are a common and extreme phenotype that can result from diverse and often elusive pathways in patients with ultrarare or undiagnosed disorders.

Methods: In this pilot study, we present an approach to understand the underlying pathways leading to seizures in patients from the Undiagnosed Diseases Network (UDN) by analyzing aggregated genotype and phenotype data from the UK Biobank (UKB). Specifically, we look for enriched phenotypes across UKB participants who harbor rare variants in the same gene known or suspected to be causally implicated in a UDN patient's recessively manifesting disorder. Analyzing these milder but related associated phenotypes in UKB participants can provide insight into the disease-causing mechanisms at play in rare disease UDN patients.

Results: We present six vignettes of undiagnosed patients experiencing seizures as part of their recessive genetic condition. For each patient, we analyze a gene of interest: MPO, P2RX7, SQSTM1, COL27A1, PIGQ, or CACNA2D2, and find relevant symptoms associated with UKB participants. We discuss the potential mechanisms by which the digestive, skeletal, circulatory, and immune system abnormalities found in the UKB patients may contribute to the severe presentations exhibited by UDN patients. We find that in our set of rare disease patients, seizures may result from diverse, multi-step pathways that involve multiple body systems.

Conclusions: Analyses of large-scale population cohorts such as the UKB can be a critical tool to further our understanding of rare diseases in general. Continued research in this area could lead to more precise diagnostics and personalized treatment strategies for patients with rare and undiagnosed conditions.
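
To make "enriched phenotype" concrete, the sketch below computes a one-sided hypergeometric tail probability (equivalent to a one-sided Fisher's exact test) for a phenotype among carriers of rare variants in a gene. The abstract does not say which statistical test the authors used, so the choice of test, the function name, and the toy counts are all assumptions.

```python
from math import comb

def hypergeom_enrichment_p(carriers_with, carriers_total, pop_with, pop_total):
    """P(X >= carriers_with) when carriers_total people are drawn without
    replacement from pop_total people, of whom pop_with have the phenotype."""
    denom = comb(pop_total, carriers_total)
    upper = min(carriers_total, pop_with)
    tail = sum(
        comb(pop_with, k) * comb(pop_total - pop_with, carriers_total - k)
        for k in range(carriers_with, upper + 1)
    )
    return tail / denom

# Toy counts: 10 participants, 3 with the phenotype, 4 carriers, 2 carriers affected.
p_value = hypergeom_enrichment_p(2, 4, 3, 10)
```

A small p-value says the phenotype is more common among carriers than random sampling would explain; in a biobank-scale analysis it would also need correction for the number of phenotypes tested.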

Biodata Mining 18(1): 6. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11740427/pdf/
Citations: 0
Correction: Predictive modeling of ALS progression: an XGBoost approach using clinical features.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-17; DOI: 10.1186/s13040-025-00423-2
Richa Gupta, Mansi Bhandari, Anhad Grover, Taher Al-Shehari, Mohammed Kadrie, Taha Alfakih, Hussain Alsalman
Biodata Mining 18(1): 5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11740421/pdf/
Citations: 0
MultiChem: predicting chemical properties using multi-view graph attention network.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-16; DOI: 10.1186/s13040-024-00419-4
Heesang Moon, Mina Rho

Background: Understanding the molecular properties of chemical compounds is essential for identifying potential candidates or ensuring safety in drug discovery. However, exploring the vast chemical space is time-consuming and costly, necessitating the development of time-efficient and cost-effective computational methods. Recent advances in deep learning approaches have offered deeper insights into molecular structures. Leveraging this progress, we developed a novel multi-view learning model.

Results: We introduce a graph-integrated model that captures both local and global structural features of chemical compounds. In our model, graph attention layers are employed to effectively capture essential local structures by jointly considering atom and bond features, while multi-head attention layers extract important global features. We evaluated our model on nine MoleculeNet datasets, encompassing both classification and regression tasks, and compared its performance with state-of-the-art methods. Our model achieved an average area under the receiver operating characteristic curve (AUROC) of 0.822 and a root mean squared error (RMSE) of 1.133, representing a 3% improvement in AUROC and a 7% improvement in RMSE over state-of-the-art models in extensive seed testing.

Conclusion: MultiChem highlights the importance of integrating both local and global structural information in predicting molecular properties, while also assessing the stability of the models across multiple datasets using various random seed values.

Implementation: The code is available at https://github.com/DMnBI/MultiChem.
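
The two metrics reported in the Results, AUROC for classification tasks and RMSE for regression tasks, can be computed as in the sketch below. This is a generic NumPy illustration, not code from the MultiChem repository; the function names are hypothetical, and the AUROC uses the pairwise (Mann-Whitney) definition, which is quadratic in sample count and therefore only suited to small examples.

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as one half."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

Higher AUROC is better and lower RMSE is better, so the two "improvements" quoted in the abstract point in opposite numerical directions.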

Biodata Mining 18(1): 4. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11737097/pdf/
Citations: 0
Genome-wide association studies are enriched for interacting genes.
IF 4; Zone 3 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-01-15; DOI: 10.1186/s13040-024-00421-w
Peter T Nguyen, Simon G Coetzee, Irina Silacheva, Dennis J Hazelett

Background: With recent advances in single cell technology, high-throughput methods provide unique insight into disease mechanisms and more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.

Results: We used genetic algorithms to measure fitness of gene-cell set proposals against a series of objective functions that capture data and annotations. The highest information objective function captured protein-protein interactions. We observed significantly greater fitness scores and subgraph sizes in foreground vs. matching sets of control variants. Furthermore, our model reliably identified known targets and ligand-receptor pairs, consistent with prior studies.

Conclusions: Our findings suggested that application of genetic algorithms to association studies can generate a coherent cellular model of risk from a set of susceptibility variants. Further, we showed, using breast cancer as an example, that such variants have a greater number of physical interactions than expected due to chance.
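
The idea of scoring gene-set proposals with a genetic algorithm can be sketched in a few lines of standard-library Python. This is a heavily simplified illustration under stated assumptions: proposals are plain gene sets rather than gene-cell sets, the only objective is the number of protein-protein interaction edges inside the set, and the selection, crossover, and mutation rules (along with every name and parameter below) are hypothetical rather than taken from the paper.

```python
import random

def fitness(genes, ppi_edges):
    """Number of protein-protein interaction edges with both ends in the proposal."""
    chosen = set(genes)
    return sum(1 for a, b in ppi_edges if a in chosen and b in chosen)

def evolve(candidates, ppi_edges, set_size, pop_size=30, generations=40, seed=0):
    """Evolve fixed-size gene sets toward a higher PPI edge count."""
    rng = random.Random(seed)
    candidates = sorted(candidates)
    population = [rng.sample(candidates, set_size) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        population.sort(key=lambda g: fitness(g, ppi_edges), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            pool = sorted(set(mum) | set(dad))
            child = rng.sample(pool, set_size)  # crossover: draw from both parents
            if rng.random() < 0.3:  # mutation: swap one gene for an unused one
                outside = [g for g in candidates if g not in child]
                if outside:
                    child[rng.randrange(set_size)] = rng.choice(outside)
            children.append(child)
        population = parents + children
    return max(population, key=lambda g: fitness(g, ppi_edges))
```

Comparing the best fitness reached from disease-associated genes against the fitness reached from matched control sets is one way to frame the "more interactions than expected by chance" claim, though the paper's actual comparison may differ.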

Citations: 0
The Venus score for the assessment of the quality and trustworthiness of biomedical datasets.
IF 4 Zone 3 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-09 DOI: 10.1186/s13040-024-00412-x
Davide Chicco, Alessandro Fabris, Giuseppe Jurman

Biomedical datasets are the mainstays of computational biology and health informatics projects, and can be found on multiple data platforms online or obtained from wet-lab biologists and physicians. The quality and the trustworthiness of these datasets, however, can sometimes be poor, producing bad results in turn, which can harm patients and data subjects. To address this problem, policy-makers, researchers, and consortia have proposed diverse regulations, guidelines, and scores to assess the quality and increase the reliability of datasets. Although generally useful, they are often incomplete and impractical. The guidelines of Datasheets for Datasets, in particular, are too numerous; the requirements of the Kaggle Dataset Usability Score focus on non-scientific requisites (for example, including a cover image); and the European Union Artificial Intelligence Act (EU AI Act) sets forth sparse and general data governance requirements, which we tailored to datasets for biomedical AI. Against this backdrop, we introduce our new Venus score to assess the data quality and trustworthiness of biomedical datasets. Our score ranges from 0 to 10 and consists of ten questions that anyone developing a bioinformatics, medical informatics, or cheminformatics dataset should answer before the release. In this study, we first describe the EU AI Act, Datasheets for Datasets, and the Kaggle Dataset Usability Score, presenting their requirements and their drawbacks. To do so, we reverse-engineer the weights of the influential Kaggle Score for the first time and report them in this study. We distill the most important data governance requirements into ten questions tailored to the biomedical domain, comprising the Venus score. We apply the Venus score to twelve datasets from multiple subdomains, including electronic health records, medical imaging, microarray and bulk RNA-seq gene expression, cheminformatics, physiologic electrogram signals, and medical text. Analyzing the results, we surface fine-grained strengths and weaknesses of popular datasets, as well as aggregate trends. Most notably, we find a widespread tendency to gloss over sources of data inaccuracy and noise, which may hinder the reliable exploitation of data and, consequently, research results. Overall, our results confirm the applicability and utility of the Venus score to assess the trustworthiness of biomedical data.

Citations: 0
Open challenges and opportunities in federated foundation models towards biomedical healthcare.
IF 4 Zone 3 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-04 DOI: 10.1186/s13040-024-00414-9
Xingyu Li, Lu Peng, Yu-Ping Wang, Weihua Zhang

This survey explores the transformative impact of foundation models (FMs) in artificial intelligence, focusing on their integration with federated learning (FL) in biomedical research. Foundation models such as ChatGPT, LLaMa, and CLIP, which are trained on vast datasets through methods including unsupervised pretraining, self-supervised learning, instructed fine-tuning, and reinforcement learning from human feedback, represent significant advancements in machine learning. These models, with their ability to generate coherent text and realistic images, are crucial for biomedical applications that require processing diverse data forms such as clinical reports, diagnostic images, and multimodal patient interactions. The incorporation of FL with these sophisticated models presents a promising strategy to harness their analytical power while safeguarding the privacy of sensitive medical data. This approach not only enhances the capabilities of FMs in medical diagnostics and personalized treatment but also addresses critical concerns about data privacy and security in healthcare. This survey reviews the current applications of FMs in federated settings, underscores the challenges, and identifies future research directions including scaling FMs, managing data diversity, and enhancing communication efficiency within FL frameworks. The objective is to encourage further research into the combined potential of FMs and FL, laying the groundwork for healthcare innovations.
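
To make the privacy argument concrete, the sketch below shows federated averaging, one common aggregation scheme in federated learning. The abstract does not name a specific algorithm, so this choice is an assumption, as are the two hypothetical client datasets, the tiny linear model, and the learning rate. Each site takes a gradient step on its own data, and only the resulting weights are sent for averaging.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def local_update(weights, data, lr=0.1):
    """One gradient step of least squares for y ~ w0 + w1 * x on local data."""
    w0, w1 = weights
    g0 = sum((w0 + w1 * x - y) for x, y in data) * 2 / len(data)
    g1 = sum((w0 + w1 * x - y) * x for x, y in data) * 2 / len(data)
    return [w0 - lr * g0, w1 - lr * g1]


# Hypothetical private datasets held by two sites (both follow y = 1 + 2x).
# The raw (x, y) pairs never leave the site; only updated weights are shared.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

global_weights = [0.0, 0.0]
for _ in range(200):
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates, [len(d) for d in clients])

print(global_weights)
```

With a single local step per round and weighting by sample count, the averaged update matches a gradient step on the pooled data, which is why the global weights approach the shared line even though no site sees the other's records. Foundation models have far more parameters than this example, which is where the communication-efficiency challenge named in the abstract arises.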

Citations: 0
Correction: Detection and classification of long terminal repeat sequences in plant LTR-retrotransposons and their analysis using explainable machine learning.
IF 4 Zone 3 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-30 DOI: 10.1186/s13040-024-00417-6
Jakub Horvath, Pavel Jedlicka, Marie Kratka, Zdenek Kubat, Eduard Kejnovsky, Matej Lexa
Citations: 0