Pub Date : 2024-07-15 DOI: 10.2174/0115748936302984240604061302
Yilmaz Atay, Lionel Alangeh Ngobesing, Mustafa Ozgur Cingiz
Introduction: Cancer driver genes are the genes responsible for cancer genesis; identifying them is therefore crucial for advancing cancer treatment. The accuracy with which driver genes are separated from the vast pool of normal passenger genes directly influences the efficacy of treatment approaches. Objective: This research aimed to identify cancer driver genes effectively using the List-based Simulated Annealing (LBSA) optimization technique. Method: The proposed model (LBSA-DRIVER) applies a list-based simulated annealing algorithm within a bipartite network to pinpoint cancer driver genes. The process begins by creating a bipartite graph that integrates gene mutation and expression data from carefully chosen datasets. The LBSA algorithm is then applied to the generated graph to identify and rank the genes, drawing on a biological interaction network. Result: Following the algorithm's development, experimental analyses were conducted on four benchmark datasets from The Cancer Genome Atlas (TCGA) database: the Breast Cancer dataset (BRCA), the Prostate Adenocarcinoma dataset (PRAD), the Ovarian cancer dataset (OV), and the Glioblastoma Multiforme dataset (GBM). Conclusion: Our findings, including precision, recall, F-score, and accuracy metrics, provide strong evidence of the effectiveness of the proposed model in identifying driver genes.
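The list-based simulated annealing idea named in the abstract can be illustrated with a minimal, generic sketch. It is not the LBSA-DRIVER implementation: the gene-ranking objective and the bipartite graph are not reproduced, and the toy cost function, the neighbour move, and every parameter name (`list_len`, `outer`, `inner`, `p0`) are assumptions made for illustration. The sketch keeps a list of temperatures, always anneals with the largest one, and replaces it with the mean of the temperatures implied by the uphill moves that were accepted.

```python
import math
import random


def lbsa_minimize(cost, neighbor, start, list_len=10, outer=200, inner=20, p0=0.5, seed=0):
    """Generic list-based simulated annealing loop (minimization)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost

    # Seed the temperature list from sampled cost gaps: t = -|delta| / ln(p0).
    temps = []
    for _ in range(100 * list_len):
        if len(temps) == list_len:
            break
        delta = abs(cost(neighbor(current, rng)) - current_cost)
        if delta > 0:
            temps.append(-delta / math.log(p0))
    if not temps:
        temps = [1.0]  # fallback when every sampled neighbour has equal cost

    for _ in range(outer):
        t_max = max(temps)  # the largest temperature in the list drives acceptance
        accepted = []
        for _ in range(inner):
            cand = neighbor(current, rng)
            cand_cost = cost(cand)
            delta = cand_cost - current_cost
            if delta <= 0:
                current, current_cost = cand, cand_cost
            else:
                r = rng.random()
                if 0.0 < r < math.exp(-delta / t_max):
                    # Temperature at which this uphill move is accepted exactly at r.
                    accepted.append(-delta / math.log(r))
                    current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        if accepted:
            # Adapt the list: swap the maximum for the mean accepted temperature.
            temps.remove(t_max)
            temps.append(sum(accepted) / len(accepted))
    return best, best_cost


def toy_cost(x):
    # Hypothetical objective with its minimum at x = 3.
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    # Hypothetical move: step one integer left or right.
    return x + rng.choice((-1, 1))


best, best_cost = lbsa_minimize(toy_cost, toy_neighbor, start=10)
print(best, best_cost)
```

Because each replacement temperature is lower than the maximum it replaces, the list cools on its own, which removes the need for a hand-tuned cooling schedule.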
{"title":"LBSA-DRIVER: A Novel Approach to Identifying Cancer Driver Genes Using List-Based Simulated Annealing","authors":"Yilmaz Atay, Lionel Alangeh Ngobesing, Mustafa Ozgur Cingiz","doi":"10.2174/0115748936302984240604061302","DOIUrl":"https://doi.org/10.2174/0115748936302984240604061302","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-15","publicationTypes":"Journal Article"}
Pub Date : 2024-07-10 DOI: 10.2174/0115748936299646240625092734
Yang Lv, Ting Liu, YuChen Ma, Hongqiang Lyu, Ze Liu
Background: The identification and functional prediction of Multifunctional Therapeutic Peptides (MFTP) play a pivotal role in drug discovery, particularly for conditions such as inflammation and hyperglycemia. Current computational methods are limited in their ability to predict the multifunctionality of these peptides accurately. Methods: We propose a novel Wide & Deep Learning Framework that integrates deep learning and machine learning approaches. The deep segment processes word vectors using a neural network model, while the wide segment feeds the physicochemical properties of peptides to a random forest-based model. This hybrid approach aims to enhance the accuracy of MFTP function prediction. Results: Our framework outperformed the existing PrMFTP predictor in terms of precision, coverage, accuracy, and absolute true values. The evaluation was conducted on both training and independent testing datasets, demonstrating the robustness and generalizability of our model. Conclusion: The proposed Wide & Deep Learning Framework offers a significant advancement in the computational prediction of MFTP functions. The availability of our model through a user-friendly web interface at MFTP-Tool.m6aminer.cn provides a valuable tool for researchers in the field of therapeutic peptide-based drug discovery, potentially accelerating the development of new treatments.
{"title":"MFTP-Tool: A Wide & Deep Learning Framework for Multi-Functional Therapeutic Peptides Prediction","authors":"Yang Lv, Ting Liu, YuChen Ma, Hongqiang Lyu, Ze Liu","doi":"10.2174/0115748936299646240625092734","DOIUrl":"https://doi.org/10.2174/0115748936299646240625092734","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-10","publicationTypes":"Journal Article"}
Pub Date : 2024-07-10 DOI: 10.2174/0115748936295679240620094626
Baoping Zhu, Fan Yang, Hongliang Duan, Zhipeng Gao
Aims: This study aims to leverage artificial intelligence to enhance medical diagnosis, focusing on ultrasound evaluation of fetal development and detection of fetal diseases. Background: Traditional ultrasound diagnostic methods are time-consuming and laborious, prompting the need for more efficient approaches. Objective: The objective of this research is to develop an end-to-end automatic diagnosis system that uses convolutional neural networks with ensemble learning to improve robustness and accuracy in classifying ultrasound images. Method: The study involves constructing and implementing the automatic diagnosis system and training it on a diverse dataset encompassing six categories: abdomen, brain, femur, thorax, maternal cervix, and other planes. Result: Experimental results demonstrate that the proposed end-to-end system significantly improves the detection accuracy of the standard plane in ultrasound images. Conclusion: The application of artificial intelligence through an ensemble learning-based automatic diagnosis system shows promise in advancing ultrasound-based medical diagnosis, particularly in fetal development assessment. Other: This research contributes to ongoing efforts to leverage technology for more efficient and accurate medical diagnostic processes.
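One common way to combine several CNN classifiers is soft voting, sketched below. The abstract does not say how the ensemble members are combined, so averaging class probabilities is an assumption here, and the three probability vectors are made-up values rather than outputs of the authors' networks. Only the six category names come from the abstract.

```python
# The six plane categories listed in the abstract.
CLASSES = ["abdomen", "brain", "femur", "thorax", "maternal cervix", "other"]


def soft_vote(prob_rows):
    """Average per-class probabilities across models and pick the top class.

    prob_rows: one probability vector per model, ordered like CLASSES.
    """
    n_models = len(prob_rows)
    avg = [sum(row[i] for row in prob_rows) / n_models for i in range(len(CLASSES))]
    top = max(range(len(CLASSES)), key=avg.__getitem__)
    return CLASSES[top], avg


# Hypothetical softmax outputs of three CNNs for one ultrasound image.
model_outputs = [
    [0.10, 0.60, 0.10, 0.10, 0.05, 0.05],
    [0.20, 0.30, 0.30, 0.10, 0.05, 0.05],
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],
]
label, avg = soft_vote(model_outputs)
print(label)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh two uncertain ones, which is one reason ensembles tend to be more robust than any single member.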
{"title":"Automatic Detection of Standard Planes in Fetal Ultrasound Images based on Convolutional Neural Networks and Ensemble Learning","authors":"Baoping Zhu, Fan Yang, Hongliang Duan, Zhipeng Gao","doi":"10.2174/0115748936295679240620094626","DOIUrl":"https://doi.org/10.2174/0115748936295679240620094626","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-10","publicationTypes":"Journal Article"}
Pub Date : 2024-07-10 DOI: 10.2174/0115748936296151240605053713
Ranze Xie, Deqing Hong, Jiaqi Yuan, Peng Xu, Wenbin Liu, Zheng Ye
Background: Endometriosis is a debilitating gynecological disorder characterized by chronic pain, infertility, and the growth of endometrial tissue outside the uterus. Accurate and early detection of this condition is crucial for effective management and treatment. Methods: We developed a gene rank matrix-based model to integrate endometriosis cohorts across multiple platforms. After removing batch effects, we identified 83 genes associated with endometriosis and further refined a diagnostic model using 11 of these genes. The model was trained on two platforms and validated on two others using SVM, Random Forest, Logistic Regression, and gradient-boosting machine learning algorithms. Results: The integration via the gene rank matrix effectively mitigated batch effects. Utilizing a gradient boosting classifier with a subset of 11 genes, the model demonstrated commendable diagnostic efficacy, achieving an Area Under the Curve (AUC) of 0.77, an accuracy of 0.72, and an F1 score of 0.72 for the training dataset. When subjected to validation, the model maintained its performance, yielding an AUC of 0.769, an accuracy of 0.719, and an F1 score of 0.732. These 11 genes were found to be associated with immunosuppression. Conclusion: Our approach to integrating gene rank matrices effectively consolidates endometriosis data across diverse platforms. The diagnostic model, harnessing the predictive power of 11 specific genes, surpasses alternative models, thereby offering promising prospects for aiding clinical diagnosis of endometriosis. Further validation is imperative to elucidate the functional significance of these 11 genes. Our study underscores the potential of data integration coupled with machine learning techniques in advancing the diagnosis of intricate diseases, such as endometriosis.
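The rank-based integration described above can be sketched as follows. The abstract does not specify how the gene rank matrix is built, so this assumes the simplest reading: each sample's expression values are replaced by their within-sample ranks, which discards platform-specific scale. The sample names and expression values are hypothetical.

```python
def rank_within_sample(values):
    """Return average ranks (1 = lowest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_matrix(expr):
    """expr maps sample name -> expression values in a shared gene order."""
    return {sample: rank_within_sample(vals) for sample, vals in expr.items()}


# Hypothetical samples from two platforms with very different scales.
expr = {
    "platformA_s1": [5.0, 1.0, 3.0],
    "platformB_s1": [500.0, 20.0, 90.0],
}
ranked = rank_matrix(expr)
print(ranked)
```

Both samples map to the same rank vector even though their raw scales differ by two orders of magnitude, which is the property that makes rank matrices attractive for cross-platform integration.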
{"title":"Rank Matrix Approach for Endometriosis: Integrating Data and Constructing Diagnostic Models","authors":"Ranze Xie, Deqing Hong, Jiaqi Yuan, Peng Xu, Wenbin Liu, Zheng Ye","doi":"10.2174/0115748936296151240605053713","DOIUrl":"https://doi.org/10.2174/0115748936296151240605053713","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-10","publicationTypes":"Journal Article"}
Pub Date : 2024-07-10 DOI: 10.2174/0115748936303435240702112205
Freeson Kaniwa
Background: The human genome is densely populated with repetitive DNA sequences that play crucial roles in genomic function and structure but are also implicated in over 40 human diseases. Identifying and characterizing these repeats is a significant computational challenge because the complexity and size of the genome overwhelm traditional algorithms. Methods: To address these challenges, we propose GenRepAI, a deep learning framework to navigate and analyze genomic suffix trees. GenRepAI employs supervised machine learning classifiers trained on labeled datasets of repeat annotations and unsupervised anomaly detection to identify novel repeat sequences. The models are trained using convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and vision transformers to classify and annotate repeats within the human genome. Results: GenRepAI is designed to comprehensively profile repeats that underlie various neurological diseases, allowing researchers to identify pathogenic expansions. The framework will integrate into existing genomic analysis pipelines, with the capability to screen patient genomes and highlight potential causal variants for further validation. Conclusion: GenRepAI is set to become a foundational tool in genomics, leveraging artificial intelligence to enhance the characterization of repetitive sequences. It promises significant advancements in the molecular diagnosis of repeat expansion disorders and contributes to a deeper understanding of genomic structure and function, with broad applications in personalized medicine.
{"title":"GenRepAI: Utilizing Artificial Intelligence to Identify Repeats in Genomic Suffix Trees","authors":"Freeson Kaniwa","doi":"10.2174/0115748936303435240702112205","DOIUrl":"https://doi.org/10.2174/0115748936303435240702112205","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-10","publicationTypes":"Journal Article"}
Pub Date : 2024-07-03 DOI: 10.2174/0115748936295986240619162816
Jiayu Xu, Chengkui Zhao, Zhenyu Wei, Weixin Xie, Qi Cheng, Min Zhang, Shuangze Han, Liqing Kang, Nan Xu, Lei Yu, Weixing Feng
Background: Chimeric Antigen Receptor (CAR)-T cell therapy has emerged as a highly effective treatment for hematological tumors. However, the associated adverse reaction, Cytokine Release Syndrome (CRS), poses a significant challenge. While numerous studies have investigated CRS biomarkers during CAR-T cell therapy, the ability to predict CRS risk prior to treatment initiation remains a crucial yet underexplored aspect. Objective: The primary purpose of this study was to address the issue of limited data, to explore an alternative approach that uses public data to identify predictive markers for CRS risk assessment from RNA-Seq data of pre-treatment patients, and to understand the mechanisms that induce CRS. Methods: We integrated information from two public databases: the FDA Adverse Event Reporting System (FAERS) for adverse reaction reports of CAR-T cell therapy, and The Cancer Genome Atlas (TCGA) for RNA-Seq data on the corresponding hematological tumors. Candidate genes were screened by correlation analysis between Reporting Odds Ratio (ROR) values and RNA-Seq gene expression levels, and core factors were then identified through stepwise analysis of pathway enrichment, cluster analysis, and protein interactions. Results: Our analysis highlighted the correlation between CRS risk and pre-treatment T cell activation/proliferation, identifying key genes (IFN-γ, IL1β, IL2, IL6, and IL10) as significant CRS indicators. Conclusion: This study offers a unique perspective on predicting CRS risk before CAR-T cell therapy, circumventing the challenges of scarce clinical data by leveraging analysis of public databases. It elucidates the crucial role of T cell activation/proliferation dynamics in CRS. The analytical methods and identified markers provide a reference for the research and clinical application of CAR-T cell therapy.
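The ROR used in the screening step is a standard disproportionality measure computed from a 2x2 table of adverse event reports. The sketch below shows the textbook calculation with a 95% confidence interval; the report counts are invented for illustration and are not taken from FAERS or from the study, and the function name is hypothetical.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """ROR with a 95% confidence interval from a 2x2 report table.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with other drugs and the event of interest
    d: reports with other drugs and any other event
    All four counts must be positive.
    """
    ror = (a / b) / (c / d)
    # Standard error of ln(ROR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
ror, lower, upper = reporting_odds_ratio(a=20, b=80, c=10, d=190)
print(round(ror, 2), round(lower, 2), round(upper, 2))
```

A signal is conventionally considered present when the lower bound of the interval exceeds 1. One ROR per tumor type could then be correlated against gene expression levels, as the abstract describes.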
{"title":"Screening Analysis of Predictive Markers for Cytokine Release Syndrome Risk in CAR-T Cell Therapy","authors":"Jiayu Xu, Chengkui Zhao, Zhenyu Wei, Weixin Xie, Qi Cheng, Min Zhang, Shuangze Han, Liqing Kang, Nan Xu, Lei Yu, Weixing Feng","doi":"10.2174/0115748936295986240619162816","DOIUrl":"https://doi.org/10.2174/0115748936295986240619162816","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-07-03","publicationTypes":"Journal Article"}
Pub Date : 2024-06-26 DOI: 10.2174/0115748936304167240530091051
Khadijeh Shokri, Naser Farrokhi, Asadollah Ahmadikhah, Mahdi Safaeizade, Amir Mousavi
Background: Gene expression is regulated in a spatiotemporal manner, and the roles of microProteins (MiPs) in this context have started to become clear in plants. Methods: Here, a microarray data analysis was carried out to decipher the spatiotemporal role of MiPs in embryo development. The guilt-by-association method was used to determine the corresponding regulatory factors. Results: Module network analyses and protein-protein interaction (PPI) assays suggested 13 modules for embryo development in the model plant Arabidopsis. Biological processes such as metabolite biosynthesis, hormone transition and regulation, fatty acid and storage protein biosynthesis, and photosynthesis-related processes were prevalent. Different transcription factors (TFs) at different stages of embryo development were found and reviewed. Furthermore, 106 putative MiPs were identified that might be involved in the regulation of embryo development. Fifteen candidate hub MiPs at embryo developmental stages were identified by PPI network analysis, and their putative regulatory roles were discussed. Previously reported MiPs, AT1G14760 (KNOX), AT5G39860 (PRE1), and AT2G46410 (CPC), were present in modules M3 and M8. Conclusion: Molecular comprehension of regulatory factors, including MiPs and TFs, during embryo development allows targeted breeding of the corresponding traits and genome-based engineering of value-added new varieties.
{"title":"Insights into Co-Expression Network Analysis of MicroProteins and their Target Transcription Factors in Plant Embryo Development","authors":"Khadijeh Shokri, Naser Farrokhi, Asadollah Ahmadikhah, Mahdi Safaeizade, Amir Mousavi","doi":"10.2174/0115748936304167240530091051","DOIUrl":"https://doi.org/10.2174/0115748936304167240530091051","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-06-26","publicationTypes":"Journal Article"}
Pub Date : 2024-06-26 DOI: 10.2174/0115748936300671240523044154
Yichu Guo, Yixuan Zhang, Xiaoqing Liu, Pingan He, Yuni Zeng, Qi Dai
Introduction: N6-methyldeoxyadenine (6mA) is the most prevalent DNA modification in both prokaryotes and eukaryotes. While single-molecule real-time sequencing (SMRT-seq) can detect 6mA events at the individual nucleotide level, its practical application is hindered by a high rate of false positives. Methods: We propose a computational model for identifying DNA 6mA that incorporates comprehensive site features from SMRT-seq and employs machine learning classifiers. Results: The results demonstrate that 99.54% and 96.55% of the identified DNA 6mA instances in C. reinhardtii correspond with motifs and peak regions identified by methylated DNA immunoprecipitation sequencing (MeDIP-seq), respectively. Compared to SMRT-seq, the proportion of predicted DNA 6mA instances within MeDIP-seq peak regions increases by 2% to 70% across the six bacterial strains. Conclusion: Our proposed method effectively reduces the false-positive rate in DNA 6mA prediction.
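The validation metric reported above, the share of predicted 6mA sites that fall inside MeDIP-seq peak regions, can be computed as in the sketch below. The coordinate convention (half-open intervals), the function name, and the example sites and peaks are all assumptions for illustration; none of them come from the study's data.

```python
def fraction_in_peaks(sites, peaks):
    """Fraction of predicted sites that lie inside any peak region.

    sites: list of (chrom, pos) tuples
    peaks: list of (chrom, start, end) tuples, treated as half-open [start, end)
    """
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = sum(
        1
        for chrom, pos in sites
        if any(start <= pos < end for start, end in by_chrom.get(chrom, []))
    )
    return hits / len(sites) if sites else 0.0


# Hypothetical predicted 6mA sites and MeDIP-seq peaks.
sites = [("chr1", 100), ("chr1", 250), ("chr2", 50), ("chr2", 500)]
peaks = [("chr1", 90, 110), ("chr2", 0, 100)]
frac = fraction_in_peaks(sites, peaks)
print(frac)
```

The linear scan over peaks is fine for a small example; for genome-scale data, sorting the peaks and using binary search or an interval tree would be the usual choice.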
{"title":"Detection of DNA N6-Methyladenine Modification through SMRT-seq Features and Machine Learning Model","authors":"Yichu Guo, Yixuan Zhang, Xiaoqing Liu, Pingan He, Yuni Zeng, Qi Dai","doi":"10.2174/0115748936300671240523044154","DOIUrl":"https://doi.org/10.2174/0115748936300671240523044154","journal":{"name":"Current Bioinformatics"},"publicationDate":"2024-06-26","publicationTypes":"Journal Article"}
Pub Date: 2024-06-24 DOI: 10.2174/0115748936308171240605075531
Yingdi Wu, Fuzhen Cao, Juntao Li
Background: Integrating multi-omics data for cancer classification brings complementary biological insights but also poses challenges such as data integration, gene grouping, and adaptive weight construction. Objective: This paper aims to address the challenges of cancer subtype classification and gene screening based on multi-omics data. Methods: Multinomial logistic regression with adaptive regularization (MLRAR) was proposed by integrating DNA methylation, gene mutation, and RNA-seq information. A data preprocessing strategy that effectively utilizes multi-omics information was presented, and the local maximum quasiclique merging (lmQCM) algorithm was implemented to group genes. Biological pathway information was utilized to evaluate the significance of gene groups, while the significance of each gene within a group was evaluated by integrating mutation information, information theory, and methylation information. Results: Compared to MRlasso, MRGL, MSGL, MROGL, AMRSOGL, and AGLRMR, the proposed method improved breast cancer subtype classification accuracy by 2.6%, 2.9%, 3.5%, 2.3%, 2.0%, and 1.8%, respectively. MLRAR also achieved significant improvements for ovarian cancer, of 8.2%, 5.0%, 6.8%, 5.2%, 12.7%, and 6.3%, respectively. Conclusion: The proposed method can effectively deal with data integration, gene grouping, and adaptive weight construction.
{"title":"Multinomial Logistic Regression with Adaptive Regularization for Cancer Subtype Classification via Multi-omics Data","authors":"Yingdi Wu, Fuzhen Cao, Juntao Li","doi":"10.2174/0115748936308171240605075531","DOIUrl":"https://doi.org/10.2174/0115748936308171240605075531","url":null,"abstract":"Background: Integrating multi-omics data for cancer classification brings complementary biological insights while also facing challenges such as data integration, gene grouping, and adaptive weight construction. Objective: This paper aims to address the challenges faced by the cancer subtype classification and gene screening based on multi-omics data. Methods: Multinomial logistic regression with adaptive regularization (MLRAR) was proposed by integrating DNA methylation, gene mutation, and RNA-seq information. A data preprocessing strategy that effectively utilizes multi-omics information was presented, and the local maximum quasiclique merging (lmQCM) algorithm was implemented to group genes. Biological pathway information was utilized to evaluate the significance of gene groups, while the significance of each gene within a group was evaluated by integrating mutation information, information theory, and methylation information. Results: Compared to MRlasso, MRGL, MSGL, MROGL, AMRSOGL, and AGLRMR, the proposed method yielded improvements in subtype classification accuracy of breast cancer by 2.6%, 2.9%, 3.5%, 2.3%, 2.0%, and 1.8%, respectively. In addition, MLRAR also achieved significant improvements in ovarian cancer by 8.2%, 5.0%, 6.8%, 5.2%, 12.7%, and 6.3%, respectively. 
Conclusion: The proposed method can effectively deal with data integration, gene grouping, and adaptive weight construction.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"2016 1","pages":""},"PeriodicalIF":4.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
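The MLRAR abstract above describes a multinomial logistic regression whose regularization uses a weight per gene group and a weight per gene within each group. The sketch below shows one plausible shape for such an objective: a softmax negative log-likelihood plus a weighted sparse-group penalty. The exact penalty form, the function name `penalized_loss`, and the parameters `lam` and `alpha` are assumptions for illustration; the abstract does not state the authors' formulation, and the weights here are plain inputs rather than values derived from pathway, mutation, or methylation data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_loss(X, y, B, groups, group_w, gene_w, lam=0.1, alpha=0.5):
    """Mean multinomial negative log-likelihood plus a weighted penalty.

    X: n samples, each a list of p gene features
    y: class index for each sample
    B: p x K coefficient matrix (one row per gene, one column per class)
    groups: list of gene-index lists (assumed grouping, e.g. from clustering)
    group_w, gene_w: assumed adaptive weights per group and per gene
    """
    n_classes = len(B[0])
    nll = 0.0
    for row, label in zip(X, y):
        scores = [sum(x * B[j][k] for j, x in enumerate(row))
                  for k in range(n_classes)]
        nll -= math.log(softmax(scores)[label])
    nll /= len(X)

    # Euclidean norm of each gene's coefficients across classes.
    gene_norm = [math.sqrt(sum(b * b for b in row)) for row in B]
    gene_part = sum(w * g for w, g in zip(gene_w, gene_norm))
    # Frobenius norm of each group's coefficient block, scaled by its weight.
    group_part = sum(w * math.sqrt(sum(gene_norm[j] ** 2 for j in idx))
                     for idx, w in zip(groups, group_w))
    return nll + lam * (alpha * gene_part + (1 - alpha) * group_part)


if __name__ == "__main__":
    X = [[1.0, 2.0], [0.5, -1.0]]
    y = [0, 2]
    B = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(penalized_loss(X, y, B, [[0, 1]], [1.0], [1.0, 1.0]))
```

In this sketch a larger group or gene weight shrinks the corresponding coefficients more strongly, which is the general mechanism by which adaptive weights can favor biologically significant genes; how the weights are actually constructed is specific to the paper and is not reproduced here.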
Pub Date: 2024-05-31 DOI: 10.2174/0115748936289033240424071522
Ge Zhang, Chenwei Ma, Chaokun Yan, Huimin Luo, Jianlin Wang, Wenjuan Liang, Junwei Luo
Background: Cancer has emerged as the "leading killer" of human health. Survival prediction is a crucial branch of cancer prognosis. It aims to estimate patients' survival risk based on their disease conditions. Accurate and efficient survival prediction is vital in cancer patients' treatment and clinical management, preventing unnecessary suffering and conserving precious medical resources. Deep learning has been extensively applied in cancer diagnosis, prognosis, and treatment management. The decreasing cost of next-generation sequencing, continuous development of related databases, and in-depth research on multimodal deep learning have provided opportunities for establishing more functionally rich and accurate survival prediction models. Objective: The field of cancer survival prediction still lacks a review of multimodal deep learning methods. Methods: We conducted a statistical analysis of the relevant research on multimodal deep learning for cancer survival prediction. We first filtered keywords from 6 known relevant papers. Then, we searched PubMed and Google Scholar for relevant publications from 2018 to 2022 using "Multimodal", "Deep Learning" and "Cancer Survival Prediction" as keywords. Next, we searched for further related publications through backward and forward citation searches. Subsequently, we conducted a detailed analysis and review of these studies based on their datasets and methods. Results: We present a comprehensive systematic review of the multimodal deep learning research on cancer survival prediction from 2018 to 2022. Conclusion: Multimodal deep learning has demonstrated powerful data aggregation capabilities and excellent performance in improving cancer survival prediction. It has made a significant positive impact on facilitating the advancement of automated cancer diagnosis and precision oncology.
{"title":"Multimodal Deep Learning for Cancer Survival Prediction: A Review","authors":"Ge Zhang, Chenwei Ma, Chaokun Yan, Huimin Luo, Jianlin Wang, Wenjuan Liang, Junwei Luo","doi":"10.2174/0115748936289033240424071522","DOIUrl":"https://doi.org/10.2174/0115748936289033240424071522","url":null,"abstract":"Background:: Cancer has emerged as the \"leading killer\" of human health. Survival prediction is a crucial branch of cancer prognosis. It aims to estimate patients' survival risk based on their disease conditions. Accurate and efficient survival prediction is vital in cancer patients' treatment and clinical management, preventing unnecessary suffering and conserving precious medical resources. Deep learning has been extensively applied in cancer diagnosis, prognosis, and treatment management. The decreasing cost of next-generation sequencing, continuous development of related databases, and in-depth research on multimodal deep learning have provided opportunities for establishing more functionally rich and accurate survival prediction models. Objective:: The current area of cancer survival prediction still lacks a review of multimodal deep learning methods. Methods:: We conducted a statistical analysis of the relevant research on multimodal deep learning for cancer survival prediction. We first filtered keywords from 6 known relevant papers. Then, we searched PubMed and Google Scholar for relevant publications from 2018 to 2022 using \"Multimodal\", \"Deep Learning\" and \"Cancer Survival Prediction\" as keywords. Then, we further searched the related publications through the backward and forward citation search. Subsequently, we conducted a detailed analysis and review of these studies based on their datasets and methods. Results:: We present a comprehensive systematic review of the multimodal deep learning research on cancer survival prediction from 2018 to 2022. 
Conclusion:: Multimodal deep learning has demonstrated powerful data aggregation capabilities and excellent performance in improving cancer survival prediction greatly. It has made a significant positive impact on facilitating the advancement of automated cancer diagnosis and precision oncology.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"58 1","pages":""},"PeriodicalIF":4.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141198177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}