Faisal Binzagr, Ansar Naseem, Muhammad Umer Farooq, Nashwan Alromema
Tumour necrosis factors (TNFs) are key players in processes such as inflammation, cancer development, and autoimmune diseases. However, accurately identifying TNFs remains challenging because of their complex interactions with other cytokines. Although existing machine learning models offer some potential, they often fall short in reliably distinguishing TNFs. To address this issue, the authors developed DEEP-TNFR, a more advanced model designed specifically to predict TNFR activity. The approach incorporates features such as relative and reverse positions, along with statistical moments, and is tested on a recognised benchmark dataset. The authors explored six deep learning classifiers: fully connected networks (FCN), convolutional neural networks (CNN), simple recurrent neural networks (RNN), long short-term memory (LSTM), bidirectional LSTM (Bi-LSTM), and gated recurrent units (GRU). The model's effectiveness was evaluated through self-consistency testing, independent set testing, and 5- and 10-fold cross-validation, using metrics such as accuracy, specificity, sensitivity, and the Matthews correlation coefficient. Among these classifiers, LSTM proved the most effective, outperforming the others and the results reported in previous studies. DEEP-TNFR is poised to support ongoing research by enhancing the accuracy of TNFR identification.
TNFR-LSTM: A Deep Intelligent Model for Identification of Tumour Necroses Factor Receptor (TNFR) Activity. Faisal Binzagr, Ansar Naseem, Muhammad Umer Farooq, Nashwan Alromema. IET Systems Biology 19(1). Published 2025-03-29. DOI: 10.1049/syb2.70007. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70007
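The evaluation metrics named in this abstract (accuracy, specificity, sensitivity, and the Matthews correlation coefficient) can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, sensitivity, specificity and MCC for a binary classifier.

    tp, fp, tn, fn are the confusion-matrix counts (hypothetical inputs here).
    Any ratio whose denominator is zero is reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient, ranging from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the paper.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four counts symmetrically, which is why it is often reported alongside accuracy when the classes are imbalanced.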
Antibiotic pollution in the environment can significantly impact soil microorganisms, for example by altering the soil microbial community or promoting the emergence of antibiotic-resistant bacteria. We propose three machine learning (ML) methods to investigate antibiotics' impact on microorganisms and predict microbial abundance. We examined the microbial abundances of various environmental soil samples treated with antibiotics. We developed three ML models: (Model 1) for predicting the most abundant bacterial classes in a specific treatment group; (Model 2) for predicting antibiotic treatment effects based on bacterial abundances; and (Model 3) for using data from short-term incubations to predict the community structure after stabilisation. In Model 1, the Random Forest model achieved the highest average accuracy, with a mean coefficient of variation of 0.05 and 0.14 in the training and test sets, respectively. In Model 2, the Random Forest and SVM models achieved the highest accuracy (nearly 0.90). Model 3 demonstrates that the Random Forest can use data from short-term incubations to predict the abundance of bacterial communities after long-term stabilisation. This study highlights the potential of ML models as powerful tools for understanding microbial dynamics in response to antibiotic treatments. The code is publicly available at https://github.com/DeweyYihengDu/ML_on_Microbiota.
Investigating the Impact of Antibiotics on Environmental Microbiota Through Machine Learning Models. Yiheng Du, Khandaker Asif Ahmed, Md Rakibul Hasan, Md Zakir Hossain. IET Systems Biology 19(1). Published 2025-03-27. DOI: 10.1049/syb2.70009. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70009
Cancer is a serious and complex disease caused by uncontrolled cell growth and is becoming one of the leading causes of death worldwide. Anticancer peptides (ACPs), bioactive peptides with lower toxicity, are emerging as a promising means of treating cancer effectively. Identifying ACPs is challenging because of the limitations of experimental conditions. To address this, we propose a dual-channel deep learning method, termed ACP-DPE, for ACP prediction. ACP-DPE consists of two parallel channels: one is an embedding layer followed by a bi-directional gated recurrent unit (Bi-GRU) module, and the other is an adaptive embedding layer followed by a dilated convolution module. The Bi-GRU module captures the peptide sequence dependencies, whereas the dilated convolution module characterises the local relationships between amino acids. Experimental results show that ACP-DPE achieves an accuracy of 82.81% and a sensitivity of 86.63%, surpassing the state-of-the-art method by 3.86% and 5.1%, respectively. These findings demonstrate the effectiveness of ACP-DPE for ACP prediction and highlight its potential as a valuable tool in cancer treatment research.
ACP-DPE: A Dual-Channel Deep Learning Model for Anticancer Peptide Prediction. Guohua Huang, Yujie Cao, Qi Dai, Weihong Chen. IET Systems Biology 19(1). Published 2025-03-22. DOI: 10.1049/syb2.70010. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70010
Recently, many studies have shown that Piwi-interacting RNAs (piRNAs) play key roles in various biological processes and are associated with complex human diseases. Because traditional biomedical experiments for determining piRNA-disease associations are slow, many computational approaches have been proposed to accelerate them. However, piRNA-disease associations can be classified into known and unknown associations, each of which may provide distinct types of information. Traditional graph convolutional networks (GCNs) typically treat all edges in a graph as identical, overlooking the fact that different edge types may carry different signals and influence the learning process in different ways. In this study, we propose a new piRNA-disease association prediction method, called PPDAMEGCN, based on a multi-edge type graph convolutional network. First, we calculate the piRNA sequence similarity from the piRNA sequence information using the Smith–Waterman method. The disease semantic similarity is computed from the disease ontology (DO). In addition, we calculate the Gaussian interaction profile (GIP) kernel similarities of piRNAs and diseases from the known piRNA-disease associations. Then, we construct the piRNA similarity network by integrating the piRNA sequence similarity and GIP similarity, and the disease similarity network by integrating the disease semantic similarity and GIP similarity. Finally, we obtain the piRNA and disease embeddings by applying the multi-edge type graph convolutional network model to the heterogeneous piRNA-disease association network. The association probability score of a piRNA-disease pair is calculated by a multilayer perceptron (MLP) from the pair's concatenated embedding. We also compare PPDAMEGCN with other piRNA-disease prediction methods. The experimental results show that our method outperforms the compared methods.
PPDAMEGCN: Predicting piRNA-Disease Associations Based on Multi-Edge Type Graph Convolutional Network. Yinglong Peng, Shuang Chu, Xindi Huang, Yan Cheng. IET Systems Biology 19(1). Published 2025-03-22. DOI: 10.1049/syb2.70011. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70011
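The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract can be illustrated with a small example. The sketch below follows the commonly used definition, K(i, j) = exp(-γ‖IP(i) − IP(j)‖²), with γ set to γ′ divided by the mean squared norm of the interaction profiles. It is an assumption that the paper uses this exact normalisation; the function name `gip_kernel`, the parameter `gamma_prime`, and the toy association matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    profiles: list of equal-length rows; row i is the interaction profile of
    entity i (for example, a piRNA's known associations with each disease).
    gamma_prime: bandwidth parameter; 1.0 is a common default (assumption).
    """
    n = len(profiles)
    # The bandwidth is normalised by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy matrix: 3 piRNAs (rows) x 3 diseases (columns); not real data.
    associations = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
    pirna_similarity = gip_kernel(associations)
    # Disease-side similarity uses the transposed matrix.
    disease_similarity = gip_kernel([list(col) for col in zip(*associations)])
    print(pirna_similarity)
    print(disease_similarity)
```

Two entities with identical known associations receive a similarity of 1, and the similarity decays towards 0 as their profiles diverge.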
Parkinson's disease (PD), a degenerative disorder affecting the nervous system, manifests as unbalanced movements, stiffness, tremors, and coordination difficulties. Its cause is believed to involve genetic and environmental factors, which underscores the critical need for prompt diagnosis and intervention to enhance treatment effectiveness. Despite the array of available diagnostics, their reliability remains a challenge. In this study, we propose a novel predictor, PADG-Pred, which not only identifies Parkinson's-associated biomarkers through genomic profiling but also integrates multiple statistical feature extraction techniques with ensemble-based classification frameworks, thereby providing a more robust and interpretable decision-making process than existing tools. Features were extracted from the processed dataset through multiple statistical moments and then used to train the model extensively with diverse ensemble classification techniques: XGBoost, Random Forest, Light Gradient Boosting Machine, Bagging, ExtraTrees, and Stacking. State-of-the-art validation procedures are applied, assessing key metrics such as specificity, accuracy, sensitivity/recall, and the Matthews correlation coefficient. The outcomes demonstrate the outstanding performance of PADG-RF, with accuracy consistently reaching ∼91% on the independent set, ∼94% in 5-fold cross-validation, and ∼96% in 10-fold cross-validation.
PADG-Pred: Exploring Ensemble Approaches for Identifying Parkinson's Disease Associated Biomarkers Using Genomic Sequences Analysis. Ayesha Karim, Tamim Alkhalifah, Fahad Alturise, Yaser Daanial Khan. IET Systems Biology 19(1). Published 2025-03-15. DOI: 10.1049/syb2.70006. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70006
Dan Liu, Changyu Qiu, Sheng Huang, Rongli Mo, Xiaomei Lu, Yanrong Zeng, Guangshu Zhu, Chaohua Zhang, Qiang Lin
As an economically important tree species, mulberry (Morus spp.) exhibits a remarkable tolerance for salinity, drought and heavy metals. However, the precise mechanism of metabolome-mediated drought adaptation is unclear. In this study, two new mulberry varieties, the drought-sensitive guisangyou62 (GSY62) and the highly drought-tolerant guiyou2024 (GY2024), were subjected to transcriptome and metabolome analyses after three days (62F or 2024F) and six days (62B or 2024B) of drought stress. The enrichment analysis demonstrated that the differentially expressed genes (DEGs) were mainly enriched in carbohydrate metabolism, amino acid metabolism, energy metabolism and secondary metabolite biosynthesis under drought stress. Notably, compared with the CK group (without drought treatment), 60 and 70 DEGs in GY2024 and GSY62 were involved in sucrose and starch biosynthesis, respectively. The genes encoding sucrose phosphate synthase 2 and 4 were downregulated in GY2024, with lower expression. The genes encoding key enzymes in starch biosynthesis were upregulated in GY2024, and their transcriptional abundance was significantly higher than in GSY62. These results indicate that drought stress reduced sucrose synthesis but accelerated starch synthesis in mulberry.
Transcriptome sequencing and metabolome analysis to reveal renewal evidence for drought adaptation in mulberry. Dan Liu, Changyu Qiu, Sheng Huang, Rongli Mo, Xiaomei Lu, Yanrong Zeng, Guangshu Zhu, Chaohua Zhang, Qiang Lin. IET Systems Biology 19(1). Published 2025-02-26. DOI: 10.1049/syb2.70004. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70004
Gong Lejun, Yu Like, Wei Xinyi, Zhou Shehai, Xu Shuhua
With the development of high-throughput sequencing technology, the analysis of single-cell RNA sequencing data has become a focus of current research, and the downstream analysis and processing of the gene expression matrix after preprocessing is an active topic. This paper proposes an iterative block matrix completion algorithm, called SeqBMC, based on matrix factorisation. The algorithm completes the missing values in the gene expression matrix that are caused by the limitations of sequencing technology. The matrix is partitioned into blocks according to gene frequency, a matrix factorisation algorithm completes each small block, and the biological zeros that may exist in the blocks are retained. Experimental results show that the algorithm significantly improves the classification performance obtained on the completed gene expression matrix, reaching an F1 score of 86.81%, which helps in recognising cell types in sequencing data. Moreover, the method relies only on machine learning, without requiring much biological prior knowledge, and still performs well. Compared with ALRA, SeqBMC improves accuracy by 5.47% and F1 score by 5.03%. This indicates that SeqBMC has significant advantages in the matrix completion of single-cell RNA sequencing data.
SeqBMC: Single-cell data processing using iterative block matrix completion algorithm based on matrix factorisation. Gong Lejun, Yu Like, Wei Xinyi, Zhou Shehai, Xu Shuhua. IET Systems Biology 19(1). Published 2025-02-12. DOI: 10.1049/syb2.70003. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70003
Ali Ghulam, Muhammad Arif, Ahsanullah Unar, Maha A. Thafar, Somayah Albaradei, Apilak Worachartcheewan
Hypertension, often known as high blood pressure, is a major concern for millions of individuals globally and is one of the risk factors for cardiovascular disorders and other health problems. Recent studies have demonstrated the significant efficacy of naturally derived peptides in reducing blood pressure. Naturally sourced bioactive peptides with antihypertensive properties show considerable potential as viable substitutes for conventional pharmaceutical medications. Currently, thorough examination of antihypertensive peptides (AHTPs) using traditional wet-lab methods is highly expensive and laborious. Therefore, in-silico approaches, especially machine learning (ML) algorithms, are attractive because they save time and cost in the discovery of AHTPs. In this study, a novel ML-based predictor, called StackAHTPs, was developed for accurately predicting AHTPs from sequence alone. The proposed method utilises two types of feature descriptors, Pseudo-Amino Acid Composition and Dipeptide Composition, to encode the local and global hidden information in peptide sequences. The encoded features are then serially merged and ranked with the SHapley Additive exPlanations (SHAP) algorithm. The top-ranked features are fed into three different ensemble classifiers (Bagging, Boosting, and Stacking) to enhance the prediction performance of the model. The StackAHTPs method achieved superior performance compared with other ML classifiers (AdaBoost, XGBoost, Light Gradient Boosting (LightGBM), Bagging and Boosting) on 10-fold cross-validation and an independent test. The experimental outcomes demonstrate that the proposed method outperformed existing methods, achieving an accuracy of 92.25% and an F1-score of 89.67% on the independent test for distinguishing AHTPs from non-AHTPs. The authors believe this research will contribute substantially to the large-scale characterisation of AHTPs and accelerate the drug discovery process. The datasets and features used are available at https://github.com/ali-ghulam/StackAHTPs.
StackAHTPs: An explainable antihypertensive peptides identifier based on heterogeneous features and stacked learning approach. Ali Ghulam, Muhammad Arif, Ahsanullah Unar, Maha A. Thafar, Somayah Albaradei, Apilak Worachartcheewan. IET Systems Biology 19(1). Published 2025-02-05. DOI: 10.1049/syb2.70002. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/syb2.70002
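Dipeptide Composition, one of the two descriptors named in this abstract, is usually defined as the frequency of each of the 400 ordered amino-acid pairs in a sequence. The sketch below shows that standard encoding only; it is not taken from the StackAHTPs repository, and the function name `dipeptide_composition`, the handling of non-standard residues, and the example peptide are assumptions.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence: str) -> list:
    """Encode a peptide as a 400-dimensional dipeptide frequency vector.

    Overlapping adjacent pairs are counted; pairs containing a non-standard
    residue are skipped (an assumption made for this sketch). Frequencies are
    normalised by the number of counted pairs, so they sum to 1 when at least
    one valid pair exists.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    # Hypothetical peptide used purely for illustration.
    vector = dipeptide_composition("IPPVAK")
    nonzero = {d: v for d, v in zip(DIPEPTIDES, vector) if v > 0}
    print(len(vector), nonzero)
```

A fixed-length vector like this lets peptides of different lengths be fed to the same classifier, which is why composition-based descriptors are common in sequence-only predictors.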
Metal ions are significant ligands that bind to proteins and play crucial roles in cell metabolism, material transport, and signal transduction. Accurately predicting protein-metal ion ligand binding residues (PMILBRs) is a challenging task in theoretical calculations. In this study, the authors employed fused amino acids and their derived information as feature parameters to predict PMILBRs using three classical machine learning algorithms, yielding favourable prediction results. Subsequently, a deep learning algorithm was incorporated into the prediction, resulting in improved results for the Ca²⁺ and Mg²⁺ sets compared to previous studies. The validation matrix provided the optimal prediction model for each ionic ligand binding residue, demonstrating that it can effectively predict the binding sites of metal ion ligands in real protein chains.
{"title":"The optimised model of predicting protein-metal ion ligand binding residues","authors":"Caiyun Yang, Xiuzhen Hu, Zhenxing Feng, Sixi Hao, Gaimei Zhang, Shaohua Chen, Guodong Guo","doi":"10.1049/syb2.70001","DOIUrl":"10.1049/syb2.70001","url":null,"abstract":"<p>Metal ions are significant ligands that bind to proteins and play crucial roles in cell metabolism, material transport, and signal transduction. Accurately predicting protein-metal ion ligand binding residues (PMILBRs) is a challenging task in theoretical calculations. In this study, the authors employed fused amino acids and their derived information as feature parameters to predict PMILBRs using three classical machine learning algorithms, yielding favourable prediction results. Subsequently, a deep learning algorithm was incorporated into the prediction, resulting in improved results for the Ca<sup>2+</sup> and Mg<sup>2+</sup> sets compared to previous studies. The validation matrix provided the optimal prediction model for each ionic ligand binding residue, demonstrating that it can effectively predict the binding sites of metal ion ligands in real protein chains.</p>","PeriodicalId":50379,"journal":{"name":"IET Systems Biology","volume":"19 1","pages":""},"PeriodicalIF":1.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143054115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The herbal sitz bath formula, as a complementary therapy, effectively alleviates postoperative wound pain and shortens healing time in patients with perianal abscesses. To investigate its mechanism of action, this study conducted 16S rRNA gene sequencing and bioinformatics analysis on wound exudate samples from patients after perianal abscess surgery. Patients were randomly divided into two groups: one receiving the herbal sitz bath as an adjunctive therapy and the other without this adjunctive therapy. Samples were collected on the first and eighth days after surgery to compare the differences in microbial community composition between the two groups on the eighth day and between the first and eighth days within each group. The study revealed that the herbal sitz bath significantly altered the structure of the microbial community, increasing its diversity and abundance. By reducing Enterococcus and increasing Bifidobacterium, Faecalibacterium, and Ruminococcus, the therapy enhanced the wound's anti-infective capacity and accelerated healing. This study explored the potential mechanism of the herbal sitz bath formula as an adjunctive therapy in promoting postoperative recovery from perianal abscesses, providing valuable data for further research on the role of microorganisms in wound care. These findings contribute to optimising postoperative treatment regimens and facilitating patient recovery.
{"title":"Microbiome analysis reveals the potential mechanism of herbal sitz bath complementary therapy in accelerating postoperative recovery from perianal abscesses","authors":"Xinghua Chen, Xiutian Guo","doi":"10.1049/syb2.12114","DOIUrl":"10.1049/syb2.12114","url":null,"abstract":"<p>The herbal sitz bath formula, as a complementary therapy, effectively alleviates postoperative wound pain and shortens healing time in patients with perianal abscesses. To investigate its mechanism of action, this study conducted 16S rRNA gene sequencing and bioinformatics analysis on wound exudate samples from patients after perianal abscess surgery. Patients were randomly divided into two groups: one receiving the herbal sitz bath as an adjunctive therapy and the other without this adjunctive therapy. Samples were collected on the first and eighth days after surgery to compare the differences in microbial community composition between the two groups on the eighth day and between the first and eighth days within each group. The study revealed that the herbal sitz bath significantly altered the structure of the microbial community, increasing its diversity and abundance. By reducing <i>Enterococcus</i> and increasing <i>Bifidobacterium</i>, <i>Faecalibacterium</i>, and <i>Ruminococcus</i>, the therapy enhanced the wound's anti-infective capacity and accelerated healing. This study explored the potential mechanism of the herbal sitz bath formula as an adjunctive therapy in promoting postoperative recovery from perianal abscesses, providing valuable data for further research on the role of microorganisms in wound care. 
These findings contribute to optimising postoperative treatment regimens and facilitating patient recovery.</p>","PeriodicalId":50379,"journal":{"name":"IET Systems Biology","volume":"19 1","pages":""},"PeriodicalIF":1.9,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771788/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143025731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}