Pub Date: 2024-09-10 DOI: 10.1109/tcbb.2024.3456302
Vincent Deman, Marine Ciantar, Laurent Naudin, Philippe Castera, Anne-Sophie Beignon
{"title":"Combining Zhegalkin Polynomials and SAT Solving for Context-specific Boolean Modeling of Biological Systems","authors":"Vincent Deman, Marine Ciantar, Laurent Naudin, Philippe Castera, Anne-Sophie Beignon","doi":"10.1109/tcbb.2024.3456302","DOIUrl":"https://doi.org/10.1109/tcbb.2024.3456302","url":null,"abstract":"","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"2 1","pages":""},"PeriodicalIF":4.5,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142182852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-09 DOI: 10.1109/tcbb.2024.3456575
Takatora Suzuki, Han Guo, Momoko Hayamizu
{"title":"Bridging Between Deviation Indices for Non-Tree-Based Phylogenetic Networks","authors":"Takatora Suzuki, Han Guo, Momoko Hayamizu","doi":"10.1109/tcbb.2024.3456575","DOIUrl":"https://doi.org/10.1109/tcbb.2024.3456575","url":null,"abstract":"","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"19 1","pages":""},"PeriodicalIF":4.5,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142182857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1109/TCBB.2024.3451348
Zhijing Li, Liwei Tian, Yiping Jiang, Yucheng Huang
Relation extraction, a crucial task in understanding the intricate relationships between entities in biomedical domains, has predominantly focused on binary relations within single sentences. However, in practical biomedical scenarios, relationships often extend across multiple sentences, leading to extraction errors with potential impacts on clinical decision-making and medical diagnosis. To overcome this limitation, we present a novel cross-sentence relation extraction framework that integrates and enhances coreference resolution and relation extraction models. Coreference resolution serves as the foundation, breaking sentence boundaries and linking entities across sentences. Our framework incorporates pre-trained deep language representations and leverages graph LSTMs to effectively model cross-sentence entity mentions. The use of a self-attentive Transformer architecture and external semantic information further enhances the modeling of intricate relationships. Comprehensive experiments conducted on two standard datasets, namely the BioNLP dataset and THYME dataset, demonstrate the state-of-the-art performance of our proposed approach.
{"title":"Relation Extraction in Biomedical Texts: A Cross-Sentence Approach.","authors":"Zhijing Li, Liwei Tian, Yiping Jiang, Yucheng Huang","doi":"10.1109/TCBB.2024.3451348","DOIUrl":"10.1109/TCBB.2024.3451348","url":null,"abstract":"<p><p>Relation extraction, a crucial task in understanding the intricate relationships between entities in biomedical domains, has predominantly focused on binary relations within single sentences. However, in practical biomedical scenarios, relationships often extend across multiple sentences, leading to extraction errors with potential impacts on clinical decision-making and medical diagnosis. To overcome this limitation, we present a novel cross-sentence relation extraction framework that integrates and enhances coreference resolution and relation extraction models. Coreference resolution serves as the foundation, breaking sentence boundaries and linking entities across sentences. Our framework incorporates pre-trained deep language representations and leverages graph LSTMs to effectively model cross-sentence entity mentions. The use of a self-attentive Transformer architecture and external semantic information further enhances the modeling of intricate relationships. 
Comprehensive experiments conducted on two standard datasets, namely the BioNLP dataset and THYME dataset, demonstrate the state-of-the-art performance of our proposed approach.</p>","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"PP ","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1109/TCBB.2024.3455381
Hao Lu, Zhiqiang Wei, Kun Zhang, Xuze Wang, Liaqat Ali, Hao Liu
Retrosynthesis prediction is a fundamental problem in organic chemistry and drug synthesis. We proposed an end-to-end deep learning model called CTsynther (Contrastive Transformer for single-step retrosynthesis prediction model) that could provide single-step retrosynthesis prediction without external reaction templates or specialized knowledge. The model introduced the concept of contrastive learning in Transformer architecture and employed a contrastive learning language representation model at the SMILES sentence level to enhance model inference by learning similarities and differences between various samples. Mixed global and local attention mechanisms allow the model to capture features and dependencies between different atoms to improve generalization. We further investigated the embedding representations of SMILES learned automatically from the model. Visualization results show that the model could effectively acquire information about identical molecules and improve prediction performance. Experiments showed that the accuracy of retrosynthesis reached 53.5% and 64.4% for with and without reaction types, respectively. The validity of the predicted reactants is improved, showing competitiveness compared with semi-template methods.
{"title":"CTsynther: Contrastive Transformer model for end-to-end retrosynthesis prediction.","authors":"Hao Lu, Zhiqiang Wei, Kun Zhang, Xuze Wang, Liaqat Ali, Hao Liu","doi":"10.1109/TCBB.2024.3455381","DOIUrl":"https://doi.org/10.1109/TCBB.2024.3455381","url":null,"abstract":"<p><p>Retrosynthesis prediction is a fundamental problem in organic chemistry and drug synthesis. We proposed an end-to-end deep learning model called CTsynther (Contrastive Transformer for single-step retrosynthesis prediction model) that could provide single-step retrosynthesis prediction without external reaction templates or specialized knowledge. The model introduced the concept of contrastive learning in Transformer architecture and employed a contrastive learning language representation model at the SMILES sentence level to enhance model inference by learning similarities and differences between various samples. Mixed global and local attention mechanisms allow the model to capture features and dependencies between different atoms to improve generalization. We further investigated the embedding representations of SMILES learned automatically from the model. Visualization results show that the model could effectively acquire information about identical molecules and improve prediction performance. Experiments showed that the accuracy of retrosynthesis reached 53.5% and 64.4% for with and without reaction types, respectively. 
The validity of the predicted reactants is improved, showing competitiveness compared with semi-template methods.</p>","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"PP ","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
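The CTsynther abstract above describes contrastive learning over SMILES-level representations but does not state which loss is used. As a rough illustration only, the sketch below uses InfoNCE, a common contrastive objective, on plain Python vectors. The function names (`cosine`, `info_nce`), the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax of the positive's similarity.

    Assumption: this is a generic contrastive loss, not the loss from the paper.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy embeddings (hypothetical): the anchor and a_view stand for two SMILES
# strings of the same molecule, the others for different molecules.
anchor = [1.0, 0.0]
a_view = [0.9, 0.1]
others = [[0.0, 1.0], [-1.0, 0.0]]

loss_matched = info_nce(anchor, a_view, others)
loss_mismatched = info_nce(anchor, [0.0, 1.0], [a_view, [-1.0, 0.0]])
```

The loss is small when the positive is the most similar vector to the anchor and grows when a negative is more similar, which is the sense in which such an objective learns "similarities and differences between various samples".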
Pub Date: 2024-09-03 DOI: 10.1109/TCBB.2024.3453499
Bin Liu, Grigorios Tsoumakas
In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure. Computational methods efficiently predict DTIs and recommend a small part of potential interacting pairs for further experimental confirmation, accelerating the drug discovery process. Although fusing heterogeneous drug and target similarities can improve the prediction ability, the existing similarity combination methods ignore the interaction consistency for neighbour entities. Furthermore, area under the precision-recall curve (AUPR) and area under the receiver operating characteristic curve (AUC) are two widely used evaluation metrics in DTI prediction. However, the two metrics are seldom considered as losses within existing DTI prediction methods. We propose a local interaction consistency (LIC) aware similarity integration method to fuse vital information from diverse views for DTI prediction models. Furthermore, we propose two matrix factorization (MF) methods that optimize AUPR and AUC using convex surrogate losses respectively, and then develop an ensemble MF approach that takes advantage of the two area under the curve metrics by combining the two single metric based MF models. Experimental results under different prediction settings show that the proposed methods outperform various competitors in terms of the metric(s) they optimize and are reliable in discovering potential new DTIs.
{"title":"Integrating Similarities Via Local Interaction Consistency and Optimizing Area Under the Curve Measures Via Matrix Factorization for Drug-Target Interaction Prediction.","authors":"Bin Liu, Grigorios Tsoumakas","doi":"10.1109/TCBB.2024.3453499","DOIUrl":"10.1109/TCBB.2024.3453499","url":null,"abstract":"<p><p>In drug discovery, identifying drug-target interactions (DTIs) via experimental approaches is a tedious and expensive procedure. Computational methods efficiently predict DTIs and recommend a small part of potential interacting pairs for further experimental confirmation, accelerating the drug discovery process. Although fusing heterogeneous drug and target similarities can improve the prediction ability, the existing similarity combination methods ignore the interaction consistency for neighbour entities. Furthermore, area under the precision-recall curve (AUPR) and area under the receiver operating characteristic curve (AUC) are two widely used evaluation metrics in DTI prediction. However, the two metrics are seldom considered as losses within existing DTI prediction methods. We propose a local interaction consistency (LIC) aware similarity integration method to fuse vital information from diverse views for DTI prediction models. Furthermore, we propose two matrix factorization (MF) methods that optimize AUPR and AUC using convex surrogate losses respectively, and then develop an ensemble MF approach that takes advantage of the two area under the curve metrics by combining the two single metric based MF models. 
Experimental results under different prediction settings show that the proposed methods outperform various competitors in terms of the metric(s) they optimize and are reliable in discovering potential new DTIs.</p>","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"PP ","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142125626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
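The two evaluation metrics named in the abstract above, AUC and AUPR, can be made concrete with a small sketch. It computes the exact (non-differentiable) metrics, not the convex surrogate losses the authors optimize, which the abstract does not specify. AUPR is estimated here by average precision, one common estimator. The function names and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision@k taken at each positive's rank."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    ap = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank
    return ap / total_pos


# Toy drug-target pairs (hypothetical): 1 = known interaction, 0 = none.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
```

Both functions depend on the scores only through their ranking, so they are piecewise constant in the model parameters. That is why methods that optimize them directly, as described above, rely on surrogate losses.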
Pub Date: 2024-08-30 DOI: 10.1109/TCBB.2024.3452055
Qingzhou Shi, Kai Zheng, Haoyuan Li, Bo Wang, Xiao Liang, Xinyu Li, Jianxin Wang
Piwi-interacting RNAs (piRNAs) are increasingly recognized as potential biomarkers for various diseases. Investigating the complex relationship between piRNAs and diseases through computational methods can reduce the costs and risks associated with biological experiments. Fast kernel learning (FKL) is a classical method for multi-source data fusion that is widely employed in association prediction research. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper the effectiveness of the network-based ideal kernel. The conventional FKL method does not address this issue. In this study, we propose a low-rank fast kernel learning (LRFKL) algorithm, which consists of low-rank representation (LRR) and the FKL algorithm. The LRFKL algorithm is designed to mitigate the effects of noise on the network-based ideal kernel. Using LRFKL, we propose a novel approach for predicting piRNA-disease associations called LKLPDA. Specifically, we first compute the similarity matrices for piRNAs and diseases. Then we use the LRFKL to fuse the similarity matrices for piRNAs and diseases separately. Finally, the LKLPDA employs AutoGluon-Tabular for predictive analysis. Computational results show that LKLPDA effectively predicts piRNA-disease associations with higher accuracy compared to previous methods. In addition, case studies confirm the reliability of the model in predicting piRNA-disease associations. Availability and implementation: The LKLPDA software and data are freely available at https://github.com/Shiqzz/LKLPDA-master.git.
{"title":"LKLPDA: A Low-Rank Fast Kernel Learning Approach for Predicting piRNA-Disease Associations.","authors":"Qingzhou Shi, Kai Zheng, Haoyuan Li, Bo Wang, Xiao Liang, Xinyu Li, Jianxin Wang","doi":"10.1109/TCBB.2024.3452055","DOIUrl":"10.1109/TCBB.2024.3452055","url":null,"abstract":"<p><p>Piwi-interacting RNAs (piRNAs) are increasingly recognized as potential biomarkers for various diseases. Investigating the complex relationship between piRNAs and diseases through computational methods can reduce the costs and risks associated with biological experiments. Fast kernel learning (FKL) is a classical method for multi-source data fusion that is widely employed in association prediction research. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper the effectiveness of the network-based ideal kernel. The conventional FKL method does not address this issue. In this study, we propose a low-rank fast kernel learning (LRFKL) algorithm, which consists of low-rank representation (LRR) and the FKL algorithm. The LRFKL algorithm is designed to mitigate the effects of noise on the network-based ideal kernel. Using LRFKL, we propose a novel approach for predicting piRNA-disease associations called LKLPDA. Specifically, we first compute the similarity matrices for piRNAs and diseases. Then we use the LRFKL to fuse the similarity matrices for piRNAs and diseases separately. Finally, the LKLPDA employs AutoGluon-Tabular for predictive analysis. Computational results show that LKLPDA effectively predicts piRNA-disease associations with higher accuracy compared to previous methods. In addition, case studies confirm the reliability of the model in predicting piRNA-disease associations. 
Availability and implementation: The LKLPDA software and data are freely available at https://github.com/Shiqzz/LKLPDA-master.git.</p>","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"PP ","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142106990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}