Pub Date: 2024-11-22 DOI: 10.1007/s12539-024-00671-6
Qi Xu, Yitong Ma, Zuhong Lu, Kun Bi
In DNA image storage, code tables and universal error-correction codes can mitigate the effect of base errors to some extent. However, they are ineffective against indels (insertion and deletion errors), which lowers both the information density and the quality of the reconstructed image. This paper proposes DP-ID, a novel encoding and decoding method for storing images in DNA that improves information density and the quality of the reconstructed image. First, the image is compressed into bitstreams by a dynamic programming algorithm. Second, the bitstreams are mapped to DNA sequences, which are then interleaved. Finally, the reconstructed image is obtained by applying median filtering to remove salt-and-pepper noise. Simulation results show that the image reconstructed by DP-ID at a 5% error rate is better than that reconstructed by other methods at a 1% error rate. This robustness to high error rates makes it possible to tolerate the biological constraints that go unsatisfied at high information density. Wet experiments show that DP-ID can reconstruct a high-quality image at 5X sequencing depth. The high information density and low sequencing depth significantly reduce the cost of DNA storage, facilitating large-scale storage of images in DNA.
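The three stages named in the abstract (bit-to-base mapping, interleaving, median filtering) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the 2-bit mapping table, the simple block (transpose) interleaver, and the 1D filter are generic textbook choices, not the DP-ID code tables or parameters, which the abstract does not specify.

```python
from statistics import median

BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T (not from the paper)

def bits_to_dna(bits: str) -> str:
    """Map an even-length bit string to bases, two bits per base."""
    return "".join(BASES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))

def interleave(strands):
    """Block interleaver: output strand j collects position j of every input strand.

    Applying it twice to equal-length strands restores the original order.
    """
    return ["".join(column) for column in zip(*strands)]

def median_filter_1d(pixels, radius=1):
    """Replace each pixel by the median of its neighbourhood (window clipped at the edges)."""
    out = []
    for i in range(len(pixels)):
        lo, hi = max(0, i - radius), min(len(pixels), i + radius + 1)
        out.append(int(median(pixels[lo:hi])))
    return out
```

With this kind of interleaver, damage concentrated in one stored strand is spread as isolated errors across many logical strands after de-interleaving. Isolated errors of that kind show up as salt-and-pepper noise in the image, which is the noise pattern a median filter removes well.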
Title: DP-ID: Interleaving and Denoising to Improve the Quality of DNA Storage Image
Journal: Interdisciplinary Sciences: Computational Life Sciences
Pub Date: 2024-11-22 DOI: 10.1007/s12539-024-00667-2
Aimin Li, Mingyue Li, Rong Fei, Saurav Mallik, Bo Hu, Yue Yu
Gene regulatory networks (GRNs) reveal complex interactions between genes in organisms and are crucial for understanding how living systems operate. The rapid development of biotechnology, especially single-cell RNA sequencing (scRNA-seq), has generated a large amount of scRNA-seq data, which can be analyzed to explore regulatory relationships between genes at the single-cell level. Previous models for constructing GRNs mainly capture associations between genes and usually fail to accurately reveal causality between them. We therefore present a hybrid deep learning model called EfficientNet-resDDSC (EfficientNet with Residual Blocks and Depthwise Separable Dilated Convolutions) to infer causality between genes. The model inherits the basic structure of EfficientNet-B0 and incorporates residual blocks as well as dilated convolutions. Introducing residual blocks enhances the model's ability to extract low-level features at the primary stage. The model combines Depthwise Separable Convolution (DSC) in the inverted linear bottleneck layers with dilated convolutions to expand its receptive fields without increasing the computational effort. This design enables the model to comprehensively reveal potential relationships among genes in high-dimensional, high-noise single-cell data. Compared with five existing deep learning models, EfficientNet-resDDSC performs significantly better overall on four datasets. EfficientNet-resDDSC was further applied to construct GRNs for breast cancer patients, focusing on the regulatory genes related to the key gene BRCA1, which contributes to the advancement of breast cancer research and treatment strategies.
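The claim that dilated, depthwise separable convolutions widen the receptive field without extra computation follows from simple arithmetic, sketched below. The function names and the example channel counts are illustrative assumptions; biases are ignored and nothing here is taken from the EfficientNet-resDDSC implementation.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are spaced `dilation` apart."""
    return k + (k - 1) * (dilation - 1)

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

# Dilation changes the span of the kernel, not the number of weights.
span = effective_kernel(3, 2)
ratio = depthwise_separable_params(32, 64, 3) / standard_conv_params(32, 64, 3)
```

For a 3 x 3 kernel with dilation 2 the span is 5 x 5, while the weight count stays that of a 3 x 3 kernel. With 32 input and 64 output channels, the separable form uses 2,336 weights against 18,432 for the standard convolution.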
Title: EfficientNet-resDDSC: A Hybrid Deep Learning Model Integrating Residual Blocks and Dilated Convolutions for Inferring Gene Causality in Single-Cell Data
Pub Date: 2024-11-19 DOI: 10.1007/s12539-024-00670-7
Sizhe Zhang, Peng Han, Haiqing Sun, Ying Su, Chen Chen, Cheng Chen, Jinyao Li, Xiaoyi Lv, Xuecong Tian, Yandan Xu
Yinchenhao Decoction (YCHD), a classic formula in traditional Chinese medicine, is believed to have the potential to treat liver diseases by modulating Toll-like receptor 4 (TLR4). A thorough exploration of the effective components of YCHD that target TLR4, and of their therapeutic mechanisms, is therefore a promising strategy for liver diseases. In this study, the AIGO-DTI deep learning framework was proposed to predict the probability that major components of YCHD target TLR4. Comparative evaluations with four machine learning models (RF, SVM, KNN, XGBoost) and two deep learning models (GCN, GAT) demonstrated that the AIGO-DTI framework exhibited the best overall performance, with Recall and AUC reaching 0.968 and 0.991, respectively. This study further used the AIGO-DTI model to identify the potential impact of Isoscopoletin, a major component of YCHD, on TLR4. Subsequent wet experiments revealed that Isoscopoletin could influence the maturation of dendritic cells (DCs) induced by lipopolysaccharide (LPS) through TLR4, suggesting its therapeutic potential for liver diseases, especially hepatitis. Additionally, based on the AIGO-DTI framework, this study established an online platform named TLR4-Predict to help domain experts discover more compounds related to TLR4. Overall, the proposed AIGO-DTI framework accurately predicts unique compounds in YCHD that interact with TLR4, providing new insights for identifying and screening lead compounds targeting TLR4.
Title: Discovery of Active Ingredient of Yinchenhao Decoction Targeting TLR4 for Hepatic Inflammatory Diseases Based on Deep Learning Approach
Pub Date: 2024-11-15 DOI: 10.1007/s12539-024-00669-0
Wanjing Zhang, Mingyang Zhang, Min Zhu
Enhancer-promoter interactions (EPIs) are crucial in gene transcription regulation and cell differentiation. Traditional biological experiments are costly and time-consuming, motivating the development of computational prediction methods. However, existing EPI prediction methods inadequately capture the intricate direct interactions between enhancer and promoter sequences, which limits their prediction performance. In this work, we propose an attention-based approach, RAEPI, which uses convolutional neural networks to extract initial features of enhancers and promoters, combined with a specially designed Restricted Attention mechanism with constrained Query-Key-Value inputs that simulates the interactions between them for further feature extraction. To improve cross-cell-line prediction, we employ a transfer learning strategy for pre-training. Furthermore, we extracted sequence motifs to evaluate RAEPI's effectiveness from a visualization perspective. Experimental results show that RAEPI achieves prediction performance competitive with existing methods on the benchmark dataset.
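The abstract does not say how the Query-Key-Value inputs are constrained. One plausible reading, assumed here purely for illustration, is cross-attention in which queries come only from enhancer features and keys and values come only from promoter features. The sketch below shows generic scaled dot-product cross-attention in plain Python; it is not the RAEPI layer.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one sequence, keys/values from another.

    queries: list of d-dimensional vectors (assumed: enhancer features)
    keys, values: equal-length lists of vectors (assumed: promoter features)
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        width = len(values[0])
        outputs.append([sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)])
    return outputs
```

Each enhancer position receives a weighted mixture of promoter features, so the output has one row per query and the width of the value vectors.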
Title: RAEPI: Predicting Enhancer-Promoter Interactions Based on Restricted Attention Mechanism
The development of peptide drugs is hindered by the risk of amyloidogenic aggregation; peptides that tend to aggregate in this manner may be unsuitable for drug design. Computational methods for predicting amyloidogenic sequences often face challenges in extracting high-quality features, and their predictive performance can be enhanced. To surmount these challenges, iAmyP was introduced as a specialized computational tool for predicting amyloidogenic hexapeptides. Using multi-view learning, iAmyP incorporates sequence, structural, and evolutionary features, performing feature selection and feature fusion through recursive feature elimination and attention mechanisms. This combination of features, followed by feature selection and fusion, leads to optimal performance with the help of an optimization algorithm based on sequence least squares programming. Notably, iAmyP exhibits robust generalization for peptides with lengths of 7-10 amino acids. Hydrophobic amino acids play a critical role in the aggregation process, and a thorough analysis has significantly enhanced our insight into their significance in amyloidogenic hexapeptides. This tool represents an advancement in the development of peptide therapeutics by providing an understanding of amyloidogenic aggregation, and it serves as a valuable framework for assessing amyloidogenic sequences. The data and code can be freely accessed at https://github.com/xialab-ahu/iAmyP.
Title: iAmyP: A Multi-view Learning for Amyloidogenic Hexapeptides Identification Based on Sequence Least Squares Programming
Authors: Jinling Cai, Jianping Zhao, Yannan Bin, Junfeng Xia, Chunhou Zheng
Pub Date: 2024-11-15 DOI: 10.1007/s12539-024-00666-3
Pub Date: 2024-11-14 DOI: 10.1007/s12539-024-00644-9
Gaili Li, Yongna Yuan, Ruisheng Zhang
The investigation of molecular interactions between ligands and their target molecules is becoming more significant as protein structure data continues to grow. In this study, we introduce PLA-STGCNnet, a deep fusion spatial-temporal graph neural network designed to study protein-ligand interactions based on the 3D structural data of protein-ligand complexes. Unlike 1D protein sequences or 2D ligand graphs, the 3D graph representation offers a more precise portrayal of the complex interactions between proteins and ligands. Our results show that the fusion model, PLA-STGCNnet, outperforms individual algorithms in predicting binding affinity. The advantage of a fusion model is that it combines the strengths of multiple models and improves overall performance by merging their features and outputs. The fusion model performs satisfactorily on different datasets, which indicates its generalization ability and stability. It showed good performance in protein-ligand affinity prediction, and we successfully applied it to drug screening. Our research underscores the promise of fusion spatial-temporal graph neural networks in addressing complex challenges in protein-ligand affinity prediction. The Python scripts for implementing various model components are accessible at https://github.com/ligaili01/PLA-STGCN.
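A 3D complex graph of the kind described is commonly built by connecting atoms that lie within a distance cutoff of each other. The sketch below shows that generic construction; the function name, the 4.0 Å cutoff, and the edge format are assumptions for illustration and are not taken from the PLA-STGCN repository.

```python
import math
from itertools import combinations

def build_contact_graph(coords, cutoff=4.0):
    """Return edges (i, j, distance) for every atom pair closer than `cutoff`.

    coords: list of (x, y, z) tuples in angstroms, protein and ligand atoms together.
    cutoff: assumed contact threshold; real pipelines tune this value.
    """
    edges = []
    for i, j in combinations(range(len(coords)), 2):
        distance = math.dist(coords[i], coords[j])
        if distance <= cutoff:
            edges.append((i, j, distance))
    return edges

# Three atoms on a line: only the first two are within the cutoff.
example_edges = build_contact_graph([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)])
```

The pairwise loop is quadratic in the number of atoms, which is acceptable for a binding pocket but would normally be replaced by a spatial index for whole complexes.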
Title: Predicting Protein-Ligand Binding Affinity Using Fusion Model of Spatial-Temporal Graph Neural Network and 3D Structure-Based Complex Graph
Accurate prediction of anticancer drug responses is essential for developing personalized treatment plans that improve cancer patient survival rates and reduce healthcare costs. To this end, we propose a drug sensitivity prediction model based on multi-stage multi-modal drug representations (ModDRDSP) to reflect the properties of drugs more comprehensively and to better model the complex interactions between cells and drugs. Specifically, for the multi-modal information of drugs, we adopt a SMILES representation learning method based on a deep hierarchical bi-directional GRU network (DSBiGRU) and a molecular graph representation learning method based on a deep message-crossing network (DMCN). Additionally, we integrate the multi-omics information of cell lines using a convolutional neural network (CNN). Finally, we use an ensemble deep forest algorithm to predict drug sensitivity. In validation, ModDRDSP outperforms four current leading models. More importantly, ablation experiments demonstrate the validity of each module of the proposed model, and case studies show good results for ModDRDSP in predicting drug sensitivity, further supporting its performance advantage.
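Before a SMILES string can be fed to a sequence model such as a bi-directional GRU, it has to be split into chemically meaningful tokens and mapped to integer ids. The sketch below shows one simple way to do that. The regular expression and the vocabulary scheme are illustrative assumptions, not the preprocessing used by ModDRDSP.

```python
import re

# Bracket atoms, two-letter halogens and two-digit ring closures are kept whole;
# every other character is its own token. This is a simplified, assumed grammar.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

def tokenize_smiles(smiles: str):
    """Split a SMILES string into tokens."""
    return SMILES_TOKEN.findall(smiles)

def encode(tokens, vocab):
    """Map tokens to integer ids, growing `vocab` as needed (id 0 is left for padding)."""
    return [vocab.setdefault(token, len(vocab) + 1) for token in tokens]

vocab = {}
ids = encode(tokenize_smiles("CCO"), vocab)  # ethanol
```

Keeping `Cl` and `Br` as single tokens matters because a character-level split would read them as carbon or boron followed by an unrelated symbol.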
Title: Drug Sensitivity Prediction Based on Multi-stage Multi-modal Drug Representation Learning
Authors: Jinmiao Song, Mingjie Wei, Shuang Zhao, Hui Zhai, Qiguo Dai, Xiaodong Duan
Pub Date: 2024-11-12 DOI: 10.1007/s12539-024-00668-1
Pub Date: 2024-11-06 DOI: 10.1007/s12539-024-00660-9
Yajing Guo, Xiujuan Lei, Shuyu Li
Circular RNA (circRNA) can bind to RNA binding proteins (RBPs), thereby exerting a substantial impact on diseases. Predicting binding sites helps in understanding the interaction mechanism and offers insights for disease treatment strategies. Here, we propose circTCA, a novel approach based on a temporal convolutional network (TCN) and a cross multi-head attention mechanism, to predict circRNA-RBP binding sites. First, we employ two distinct encoding methods to obtain two raw matrices of circRNA sequences. Then, two parallel TCN blocks extract shallow and abstract features from the two matrices separately. The two feature sets are fused through the cross multi-head attention mechanism, after which global expectation pooling assigns weights to the concatenated feature. Finally, a fully connected (FC) layer classifies the input sequence. We compare circTCA with five other methods and conduct ablation experiments to demonstrate its effectiveness. We also conduct feature visualization and compare the motifs extracted by circTCA with existing motifs. Overall, circTCA is effective for predicting circRNA-RBP binding sites.
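The basic operation inside a TCN block is a dilated causal convolution, applied here to an encoded RNA sequence. The sketch below shows a one-hot encoding and a single-channel dilated causal convolution in plain Python. One-hot is only an assumed example of an encoding (the abstract does not name the two used), and the kernel values are arbitrary.

```python
def one_hot(seq, alphabet="ACGU"):
    """Encode an RNA string as a list of one-hot rows, one per nucleotide."""
    return [[1.0 if base == letter else 0.0 for letter in alphabet] for base in seq]

def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_i kernel[i] * x[t - i * dilation], treating x as zero before position 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, weight in enumerate(kernel):
            idx = t - i * dilation
            if idx >= 0:
                acc += weight * x[idx]
        out.append(acc)
    return out

def tcn_receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated convolutions, one convolution per level."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With kernel size 3 and dilations 1, 2 and 4, three stacked layers already see 15 positions, which is why TCNs can cover long sequence contexts with few layers.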
Title: An Integrated TCN-CrossMHA Model for Predicting circRNA-RBP Binding Sites
Pub Date: 2024-10-28 DOI: 10.1007/s12539-024-00664-5
Yuhong Su, Xincheng Zeng, Lingfeng Zhang, Yanlin Bian, Yangjing Wang, Buyong Ma
Antibodies against the Aβ peptide have recently been approved to treat Alzheimer's disease, underscoring the importance of understanding their interactions for developing more potent treatments. Here we investigated the interaction between anti-Aβ antibodies and various peptides using a deep learning model. Our model, ABTrans, was trained on dodecapeptide sequences from phage display experiments and known anti-Aβ antibody sequences from public sources. It classified the binding ability between anti-Aβ antibodies and dodecapeptides into four levels (not binding, weak binding, medium binding, and strong binding), achieving an accuracy of 0.83. Using ABTrans, we examined the cross-reaction of anti-Aβ antibodies with other human amyloidogenic proteins, revealing that Aducanumab and Donanemab exhibited the least cross-reactivity. Additionally, we systematically screened interactions between eleven selected anti-Aβ antibodies and all human proteins to identify potential off-target candidates.
Title: ABTrans: A Transformer-based Model for Predicting Interaction between Anti-Aβ Antibodies and Peptides
CRISPR/Cas base editors offer precise conversion of single nucleotides without inducing double-strand breaks. This technology finds extensive applications in gene therapy, gene function analysis, and other domains. However, a crucial challenge lies in selecting the appropriate guide RNAs (gRNAs) for base editing. Although various gRNA design tools exist, creating a simplified base-editing library with diverse protospacer adjacent motif (PAM) sequences for gRNA screening remains a challenge. We present a user-friendly web tool, BES-Designer ( https://bes-designer.aielab.net ), for designing gRNAs for base editors, aimed at streamlining the creation of a base-editing library. BES-Designer incorporates our proposed rules for target sequence simplification, helping researchers narrow down the scope of biological experiments in the lab. It allows users to design target sequences with various PAMs and editing types simultaneously, and to prioritize them in the simplified base-editing library. This tool has been experimentally shown to achieve a 30% simplification efficiency on the base-editing library.
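To make the idea of designing target sequences for a given PAM and editing type concrete, here is a minimal Python sketch of the generic first step of base-editing gRNA design: scanning a sequence for protospacers that sit next to a PAM and carry the target base inside an editing window. It is not BES-Designer's implementation, and it does not reproduce the tool's simplification rules, which the abstract does not describe. The function names, the 20-nt protospacer length, the forward-strand-only scan, and the editing window at positions 4 to 8 are all illustrative assumptions.

```python
import re

# IUPAC codes needed to express a few common PAMs as regular expressions.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def pam_to_regex(pam):
    """Translate a PAM such as 'NGG' into a regular expression string."""
    return "".join(IUPAC[base] for base in pam)


def find_targets(seq, pam="NGG", target_base="C", window=(4, 8), spacer_len=20):
    """Scan the forward strand for candidate base-editing targets.

    Returns a list of (start, protospacer, pam_seq, positions) tuples, where
    `positions` are the 1-based protospacer positions (counted from the
    PAM-distal end) that hold `target_base` inside the assumed editing window.
    """
    seq = seq.upper()
    pam_re = re.compile(pam_to_regex(pam))
    first, last = window
    hits = []
    for start in range(len(seq) - spacer_len - len(pam) + 1):
        spacer = seq[start:start + spacer_len]
        pam_seq = seq[start + spacer_len:start + spacer_len + len(pam)]
        if not pam_re.fullmatch(pam_seq):
            continue  # no PAM immediately 3' of this protospacer
        positions = [i for i in range(first, last + 1)
                     if spacer[i - 1] == target_base]
        if positions:  # keep only protospacers with an editable base
            hits.append((start, spacer, pam_seq, positions))
    return hits


if __name__ == "__main__":
    # Hypothetical 23-nt input: a C at protospacer position 4, then a TGG PAM.
    example = "AAAC" + "A" * 16 + "TGG"
    for hit in find_targets(example):
        print(hit)
```

Calling `find_targets` again with a different `pam` or `target_base` (for example `target_base="A"` for an adenine base editor) and merging the results is one simple way to cover several PAMs and editing types at once; how candidates should then be ranked or pruned is left open here.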
{"title":"BES-Designer: A Web Tool to Design Guide RNAs for Base Editing to Simplify Library.","authors":"Qian Zhou, Qian Gao, Yujia Gao, Youhua Zhang, Yanjun Chen, Min Li, Pengcheng Wei, Zhenyu Yue","doi":"10.1007/s12539-024-00663-6","DOIUrl":"https://doi.org/10.1007/s12539-024-00663-6","journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences"},"publicationDate":"2024-10-28","publicationTypes":"Journal Article"}