Pub Date : 2024-09-17 DOI: 10.1109/TNB.2024.3462461
Xiaohua Wan;Yulong Hu;Dehui Qiu;Juan Zhang;Xiaotong Wang;Fa Zhang;Bin Hu
The features of the sublingual veins, including swelling, varicose patterns, and cyanosis, are pivotal in differentiating symptoms and selecting treatments in Traditional Chinese Medicine (TCM) tongue diagnosis. These features reflect the status of human blood circulation. Nevertheless, the automatic and precise extraction of sublingual vein features remains a formidable challenge, constrained by the scarcity of sublingual image datasets and by noise from non-tongue and non-sublingual-vein elements. In this paper, we present a tongue feature extraction method that segments the sublingual vein specifically, rather than the entire tongue base. To achieve this, we have developed a sublingual vein segmentation framework based on a Polyp-PVT network, which effectively eliminates noise from the regions surrounding the sublingual vein. Furthermore, we pioneer the use of a transformer-based approach, such as the Swin-Transformer network, to extract sublingual vein features. To complement our methodology, we have constructed a dataset of sublingual vein images that supports the segmentation and classification of sublingual veins. Experimental results demonstrate that our tongue feature extraction method, coupled with sublingual vein segmentation, significantly outperforms existing tongue feature extraction techniques.
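The segment-then-extract idea in the abstract can be illustrated with a minimal sketch: a binary vein mask zeroes every pixel outside the segmented region before any feature is computed. The mask below is written by hand and merely stands in for the output of a segmentation network such as Polyp-PVT; the function names `apply_mask` and `mean_inside` and the toy 3x4 image are assumptions made for illustration, not part of the paper.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[int]]


def apply_mask(image: Image, mask: Mask) -> Image:
    """Keep pixels where mask == 1 and zero out everything else."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [[px if keep else 0.0 for px, keep in zip(row, mrow)]
            for row, mrow in zip(image, mask)]


def mean_inside(image: Image, mask: Mask) -> float:
    """Mean intensity over the masked region only (a toy 'feature')."""
    vals = [px for row, mrow in zip(image, mask)
            for px, keep in zip(row, mrow) if keep]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical grayscale image: bright surroundings, darker vein region.
image = [
    [0.9, 0.8, 0.9, 0.7],
    [0.9, 0.2, 0.3, 0.8],
    [0.8, 0.2, 0.1, 0.9],
]
# Hand-written mask standing in for a segmentation network's output.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

masked = apply_mask(image, mask)
vein_mean = mean_inside(image, mask)
```

In a real pipeline the masked image, rather than a single mean, would be passed to the feature extractor; the point of the sketch is only that the surrounding pixels no longer contribute.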
{"title":"A Novel Framework for Tongue Feature Extraction Framework Based on Sublingual Vein Segmentation","authors":"Xiaohua Wan;Yulong Hu;Dehui Qiu;Juan Zhang;Xiaotong Wang;Fa Zhang;Bin Hu","doi":"10.1109/TNB.2024.3462461","DOIUrl":"10.1109/TNB.2024.3462461","url":null,"abstract":"The features of the sublingual veins, including swelling, varicose patterns, and cyanosis, are pivotal in differentiating symptoms and selecting treatments in Traditional Chinese Medicine (TCM) tongue diagnosis. These features serve as a crucial reflection of the human blood circulation status. Nevertheless, the automatic and precise extraction of sublingual vein features remains a formidable challenge, constrained by the scarcity of datasets for sublingual images and the interference of noise from non-tongue and non-sublingual vein elements. In this paper, we present an innovative tongue feature extraction method that relies on focusing specifically on segmenting the sublingual vein rather than the entire tongue base. To achieve this, we have developed a sublingual vein segmentation framework utilizing a Polyp-PVT network, effectively eliminating noise from the surrounding regions of the sublingual vein. Furthermore, we pioneer the utilization of a transformer-based approach, such as the Swin-Transformer network, to extract sublingual vein features, leveraging the remarkable capabilities of transformer networks. To complement our methodology, we have constructed a comprehensive dataset of sublingual vein images, facilitating the segmentation and classification of sublingual veins. 
Experimental results have demonstrated that our tongue feature extraction method, coupled with sublingual vein segmentation, significantly outperforms existing tongue feature extraction techniques.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"24 3","pages":"269-279"},"PeriodicalIF":3.7,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142250067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-12 DOI: 10.1109/TNB.2024.3456031
Elham Baradari;Ozgur B. Akan
Parkinson’s disease (PD) is a progressive neurodegenerative disease caused by the loss of dopaminergic neurons in the basal ganglia (BG). Currently, there is no definite cure for PD, and available treatments mainly aim to alleviate its symptoms. Because neurotransmitter-based information transmission is impaired in PD, molecular communication-based approaches are potential solutions. Molecular Communications (MC) is a bio-inspired communication method that uses molecules to carry information. Owing to its compatibility with biological systems, this mode of communication is well suited to developing bio-compatible nanomachines for diagnosis and treatment, particularly for neurodegenerative diseases like PD. This study presents a novel treatment method that introduces an Intelligent Dopamine Rate Modulator (IDRM), located in the synaptic gap between the substantia nigra pars compacta (SNc) and the striatum, to compensate for the insufficient dopamine release in the BG caused by PD. To store dopamine in the IDRM, a dopamine compound (DAC) is swallowed and passes through the digestive system, blood circulatory system, blood-brain barrier (BBB), and brain extracellular matrix before being taken up by the IDRMs. Here, the DAC concentration is calculated in these regions, revealing that the required exogenous dopamine consistently reaches the IDRM. Therefore, the perpetual dopamine insufficiency in the BG associated with PD can be compensated. This method reduces drug side effects because dopamine is not released in other brain regions. Unlike other treatments, this approach targets the root cause of PD rather than just reducing symptoms.
{"title":"Molecular Communication-Based Intelligent Dopamine Rate Modulator for Parkinson’s Disease Treatment","authors":"Elham Baradari;Ozgur B. Akan","doi":"10.1109/TNB.2024.3456031","DOIUrl":"10.1109/TNB.2024.3456031","url":null,"abstract":"Parkinson’s disease (PD) is a progressive neurodegenerative disease, and it is caused by the loss of dopaminergic neurons in the basal ganglia (BG). Currently, there is no definite cure for PD, and available treatments mainly aim to alleviate its symptoms. Due to impaired neurotransmitter-based information transmission in PD, molecular communication-based approaches can be employed as potential solutions to address this issue. Molecular Communications (MC) is a bio-inspired communication method utilizing molecules to carry information. This mode of communication stands out for developing bio-compatible nanomachines for diagnosing and treating, particularly in addressing neurodegenerative diseases like PD, due to its compatibility with biological systems. This study presents a novel treatment method that introduces an Intelligent Dopamine Rate Modulator (IDRM), which is located in the synaptic gap between the substantia nigra pars compacta (SNc) and striatum to compensate for insufficiency dopamine release in BG caused by PD. For storing dopamine in the IDRM, dopamine compound (DAC) is swallowed and crossed through the digestive system, blood circulatory system, blood-brain barrier (BBB), and brain extracellular matrix uptakes with IDRMs. Here, the DAC concentration is calculated in these regions, revealing that the required exogenous dopamine consistently reaches IDRM. Therefore, the perpetual dopamine insufficiency in BG associated with PD can be compensated. This method reduces drug side effects because dopamine is not released in other brain regions. 
Unlike other treatments, this approach targets the root cause of PD rather than just reducing symptoms.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"24 2","pages":"136-144"},"PeriodicalIF":3.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-11 DOI: 10.1109/TNB.2024.3457755
Zicheng Wang;Haojie Wang;Yanfeng Wang;Junwei Sun
DNA strand displacement (DSD), as the theoretical basis of DNA chemical reaction networks (CRNs), has promoted the development of chaotic synchronization techniques. This paper introduces a synchronization technique for two isomorphic three-dimensional chaotic systems based on DNA strand displacement and a state observer. Drawing on the theory of DNA molecules, multiple DSD reactions are used to construct a three-dimensional chaotic system. Based on two isomorphic chaotic systems, a linear transformation system and a state observer system are designed according to the theory of state observer construction. In addition, to realize the synchronization of the chaotic systems, a coupling controller is designed between the drive system and the linear transformation system, and a soft variable-structure controller is designed between the state observer system and the response system. Through multiple DSD reactions, the chemical reaction networks of four chaotic systems and two controllers are constructed, and they are cascaded to realize the synchronization of two isomorphic three-dimensional chaotic systems. Numerical simulations verify the effectiveness and robustness of the scheme. Our work extends, and provides a reference for, new methods of synchronizing chaotic systems using DSD.
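As a rough illustration of what synchronizing a drive and a response system means numerically, the sketch below couples two identical Lorenz systems with plain linear error feedback. This is a deliberately simpler, swapped-in technique: it does not implement the paper's state observer, its soft variable-structure controller, or any DSD reaction network. The choice of the Lorenz system, the gain `k`, the step size, and the initial states are all assumptions made for this example.

```python
from typing import List, Tuple

State = Tuple[float, ...]


def lorenz(s: State, sigma: float = 10.0, rho: float = 28.0,
           beta: float = 8.0 / 3.0) -> State:
    """Right-hand side of the Lorenz equations (assumed example system)."""
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def max_error(a: State, b: State) -> float:
    """Largest absolute component difference between two states."""
    return max(abs(p - q) for p, q in zip(a, b))


def simulate(k: float = 30.0, dt: float = 0.001, steps: int = 10_000) -> List[float]:
    """Euler-integrate drive and response; return the error at every step.

    The response receives the feedback term -k * (response - drive) on all
    three states, which pulls it toward the drive trajectory.
    """
    drive: State = (1.0, 1.0, 1.0)
    response: State = (-5.0, 7.0, 20.0)
    errors = [max_error(response, drive)]
    for _ in range(steps):
        fd = lorenz(drive)
        fr = lorenz(response)
        new_drive = tuple(d + dt * f for d, f in zip(drive, fd))
        new_response = tuple(r + dt * (f - k * (r - d))
                             for r, d, f in zip(response, drive, fr))
        drive, response = new_drive, new_response
        errors.append(max_error(response, drive))
    return errors


errors = simulate()
initial_error, final_error = errors[0], errors[-1]
```

With the gain set to zero the two trajectories would be expected to stay apart, since chaotic systems are sensitive to initial conditions; the feedback term is what drives the error toward zero.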
{"title":"State Observer Synchronization of Three-Dimensional Chaotic Oscillatory Systems Based on DNA Strand Displacement","authors":"Zicheng Wang;Haojie Wang;Yanfeng Wang;Junwei Sun","doi":"10.1109/TNB.2024.3457755","DOIUrl":"10.1109/TNB.2024.3457755","url":null,"abstract":"Currently, DNA strand displacement (DSD) as the theoretical basis of DNA chemical reaction networks (CRNs) has promoted the development of chaotic synchronization technique. This paper introduces the synchronization technology of two isomorphic three-dimensional chaotic systems based on DNA strand displacement under state observer. By studying the theoretical knowledge of DNA molecules, multiple DSD reactions are used to construct three-dimensional chaotic system. Based on two isomorphic chaotic systems, the linear transformation system and the state observer system are designed according to the theory of state observer construction. In addition, in order to realize the synchronization of chaotic systems, a coupling controller is designed between the drive system and the linear transformation system, and a soft variable-structure controller is designed between the state observer system and the response system. Through multiple DSD reactions, the chemical reaction networks of four chaotic systems and two controllers are constructed, and they are cascaded to realize the synchronization of two isomorphic three-dimensional chaotic systems. Numerical simulations verify the effectiveness and robustness of the scheme. 
Our work will extend and provide a reference for new methods to achieve synchronization of chaotic systems using DSD.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"24 2","pages":"145-156"},"PeriodicalIF":3.7,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The analysis and comprehension of multi-omics data have emerged as a prominent topic in bioinformatics and data science. However, the sparsity and high dimensionality of omics data make it difficult to extract meaningful information. Moreover, the heterogeneity inherent in multiple omics sources makes the effective integration of multi-omics data challenging. To tackle these challenges, we propose MFCC-SAtt, a multi-level feature contrast clustering model based on self-attention, to extract informative features from multi-omics data. MFCC-SAtt treats each omics type as a distinct modality and employs autoencoders with self-attention for each modality to integrate and compress their respective features into a shared feature space. By combining a multi-level feature extraction framework with a semantic information extractor, we mitigate optimization conflicts arising from different learning objectives. Additionally, MFCC-SAtt guides deep clustering based on multi-level features, which further enhances the quality of the output labels. Extensive experiments on multi-omics data validate the performance of MFCC-SAtt. For instance, in a pan-cancer clustering task, MFCC-SAtt achieved an accuracy of over 80.38%.
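For readers unfamiliar with the self-attention operation mentioned above, here is a minimal sketch of scaled dot-product self-attention in plain Python. It assumes identity query, key, and value projections and a hypothetical 3x2 feature matrix; the actual encoder layout of MFCC-SAtt is not described in the abstract, so this shows only the generic operation.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product self-attention with Q = K = V = x (an assumption).

    Returns the attention weights and the attended output.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in x])
        for q in x
    ]
    output = [
        [sum(w * v[j] for w, v in zip(wrow, x)) for j in range(d)]
        for wrow in weights
    ]
    return weights, output


# Hypothetical features: three samples, two feature dimensions.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, output = self_attention(x)
```

Each output row is a weighted average of all input rows, with larger weights given to rows that are more similar to the query row.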
{"title":"Strategic Multi-Omics Data Integration via Multi-Level Feature Contrasting and Matching","authors":"Jinli Zhang;Hongwei Ren;Zongli Jiang;Zheng Chen;Ziwei Yang;Yasuko Matsubara;Yasushi Sakurai","doi":"10.1109/TNB.2024.3456797","DOIUrl":"10.1109/TNB.2024.3456797","url":null,"abstract":"The analysis and comprehension of multi-omics data has emerged as a prominent topic in the field of bioinformatics and data science. However, the sparsity characteristics and high dimensionality of omics data pose difficulties in terms of extracting meaningful information. Moreover, the heterogeneity inherent in multiple omics sources makes the effective integration of multi-omics data challenging To tackle these challenges, we propose MFCC-SAtt, a multi-level feature contrast clustering model based on self-attention to extract informative features from multi-omics data. MFCC-SAtt treats each omics type as a distinct modality and employs autoencoders with self-attention for each modality to integrate and compress their respective features into a shared feature space. By utilizing a multi-level feature extraction framework along with incorporating a semantic information extractor, we mitigate optimization conflicts arising from different learning objectives. Additionally, MFCC-SAtt guides deep clustering based on multi-level features which further enhances the quality of output labels. By conducting extensive experiments on multi-omics data, we have validated the exceptional performance of MFCC-SAtt. 
For instance, in a pan-cancer clustering task, MFCC-SAtt achieved an accuracy of over 80.38%.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"23 4","pages":"579-590"},"PeriodicalIF":3.7,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-03 DOI: 10.1109/TNB.2024.3453372
Abdullah Baz;Jacob Wekalao;Ngaira Mandela;Shobhit K. Patel
This paper presents a terahertz metasurface-based sensor design incorporating graphene and other plasmonic materials for highly sensitive detection of different chemicals. The proposed sensor combines multiple resonator designs, including circular and square ring resonators, to attain enhanced sensitivity among other performance parameters. Machine learning techniques such as Random Forest regression are employed to enhance the sensor design and predict its performance. The optimized sensor demonstrates a sensitivity of 417 GHz RIU$^{-1}$ and a low detection limit of 0.264 RIU for ethanol and benzene detection. Furthermore, the integration of machine learning cuts simulation time and computational requirements by approximately 90% without compromising accuracy. The sensor’s design and performance characteristics, including its quality factor of 14.476, position it as a promising candidate for environmental monitoring and chemical sensing applications. It also demonstrates potential for 2-bit encoding applications through modulation of the graphene chemical potential. This work advances terahertz sensing by proposing new materials, structures, and computational methods for developing a high-performance chemical sensor.
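The two figures of merit quoted in the abstract follow standard definitions for refractive-index sensors: sensitivity is the resonance shift per unit change in refractive index, S = Δf / Δn, and the quality factor is the resonance frequency divided by its full width at half maximum, Q = f0 / FWHM. The short sketch below computes both. The input numbers are round, made-up values chosen only to exercise the formulas; they are not taken from the paper, and the function names are hypothetical.

```python
def sensitivity_ghz_per_riu(f1_thz: float, f2_thz: float,
                            n1: float, n2: float) -> float:
    """Sensitivity S = |f2 - f1| / |n2 - n1|, returned in GHz per RIU."""
    if n1 == n2:
        raise ValueError("refractive indices must differ")
    return abs(f2_thz - f1_thz) * 1000.0 / abs(n2 - n1)


def quality_factor(f0_thz: float, fwhm_thz: float) -> float:
    """Quality factor Q = f0 / FWHM (dimensionless)."""
    if fwhm_thz <= 0:
        raise ValueError("FWHM must be positive")
    return f0_thz / fwhm_thz


# Illustrative inputs (assumed, not measured): the resonance moves from
# 3.00 THz to 2.95 THz when the analyte index rises from 1.0 to 1.1.
S = sensitivity_ghz_per_riu(3.00, 2.95, 1.0, 1.1)
# Illustrative resonance at 3.0 THz with a 0.2 THz linewidth.
Q = quality_factor(3.0, 0.2)
```

With these assumed inputs the sensitivity works out to about 500 GHz per RIU and the quality factor to about 15, which shows the units involved rather than reproducing the reported results.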
{"title":"Design and Performance Evaluation of Machine Learning-Based Terahertz Metasurface Chemical Sensor","authors":"Abdullah Baz;Jacob Wekalao;Ngaira Mandela;Shobhit K. Patel","doi":"10.1109/TNB.2024.3453372","DOIUrl":"10.1109/TNB.2024.3453372","url":null,"abstract":"This paper presents a terahertz metasurface based sensor design incorporating graphene and other plasmonic materials for highly sensitive detection of different chemicals. The proposed sensor employs the combination of multiple resonator designs - including circular and square ring resonators - to attain enhanced sensitivity among other performance parameters. Machine learning techniques like Random Forest regression, are employed to enhance the sensor design and predict its performance. The optimized sensor demonstrates excellent sensitivity of 417 GHz RIU$^{-1}$ and a low detection limit of 0.264 RIU for ethanol and benzene detection. Furthermore, the integration of machine learning cuts down the simulation time and computational requirements by approximately 90% without compromising accuracy. The sensor’s unique design and performance characteristics, including its high-quality factor of 14.476, position it as a promising candidate for environmental monitoring and chemical sensing applications. Moreover, it also demonstrates potential for 2-bit encoding applications through strategic modulation of graphene chemical potential values. On the other hand, it also shows prospects of 2-bit encoding applications via the modulation of graphene chemical. 
This work provides a major advancement to the terahertz sensing application by proposing new materials, structures, and methods in computation in order to develop a high-performance chemical sensor.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"24 2","pages":"128-135"},"PeriodicalIF":3.7,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142125606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Circular RNAs (circRNAs) play a crucial role in gene regulation and disease association because of their unique closed continuous loop structure, which is more stable and conserved than that of ordinary linear RNAs. As fundamental work toward clarifying their functions, a large number of computational approaches for identifying circRNA formation have been proposed. However, these methods fail to fully utilize the important characteristics of back-splicing events, i.e., the positional information of the splice sites and the interaction features of their flanking sequences, for predicting circRNAs. To this end, we propose a novel approach called SIDE for predicting circRNA back-splicing events using only raw RNA sequences. Technically, SIDE employs a dual encoder to capture global and interactive features of the RNA sequence, followed by a decoder designed with contrastive learning that fuses discriminative features to improve the prediction of circRNA formation. Empirical results on three real-world datasets show the effectiveness of SIDE. Further analysis also reveals the effectiveness of SIDE.
{"title":"A Representation Learning Approach for Predicting circRNA Back-Splicing Event via Sequence-Interaction-Aware Dual Encoder","authors":"Chengxin He;Lei Duan;Huiru Zheng;Xinye Wang;Lili Guan;Jiaxuan Xu","doi":"10.1109/TNB.2024.3454079","DOIUrl":"10.1109/TNB.2024.3454079","url":null,"abstract":"Circular RNAs (circRNAs) play a crucial role in gene regulation and association with diseases because of their unique closed continuous loop structure, which is more stable and conserved than ordinary linear RNAs. As fundamental work to clarify their functions, a large number of computational approaches for identifying circRNA formation have been proposed. However, these methods fail to fully utilize the important characteristics of back-splicing events, i.e., the positional information of the splice sites and the interaction features of its flanking sequences, for predicting circRNAs. To this end, we hereby propose a novel approach called SIDE for predicting circRNA back-splicing events using only raw RNA sequences. Technically, SIDE employs a dual encoder to capture global and interactive features of the RNA sequence, and then a decoder designed by the contrastive learning to fuse out discriminative features improving the prediction of circRNAs formation. Empirical results on three real-world datasets show the effectiveness of SIDE. 
Further analysis also reveals that the effectiveness of SIDE.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"23 4","pages":"603-611"},"PeriodicalIF":3.7,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142125605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-16 DOI: 10.1109/TNB.2024.3444922
Xiangjin Hu;Haoran Yi;Hao Cheng;Yijing Zhao;Dongqi Zhang;Jinxin Li;Jingjing Ruan;Jin Zhang;Xinguo Lu
Computational synthetic lethality (SL) methods have become a promising strategy for identifying SL gene pairs for targeted cancer therapy and cancer medicine development. Feature representation that integrates various biological networks is crucial to improving identification performance. However, previous feature representation approaches, such as matrix factorization and graph neural networks, project gene features onto latent variables by preserving a specific geometric metric. Models of the gene representational latent space that consider correlations across multiple dimensionalities and preserve latent geometric structures in both sample and feature spaces are lacking. Therefore, we propose a novel method to model the gene Latent Space using matrix Tri-Factorization (LSTF), obtaining gene representations whose embedding variables result from the potential interpretation of synthetic lethality. Meanwhile, manifold subspace regularization is applied to the tri-factorization to capture the geometrical manifold structure in the latent space with gene PPI functional and GO semantic embeddings. Then, SL gene pairs are identified by reconstructing the associations from the gene representations in the latent space. The experimental results illustrate that LSTF is superior to other state-of-the-art methods. A case study demonstrates the effectiveness of the predicted SL associations.
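The reconstruction step described above can be sketched in a few lines: once factor matrices are available, candidate gene pairs are scored by multiplying the factors back together. The sketch assumes a symmetric tri-factorization of the form X ≈ F S Fᵀ, with hand-picked rather than learned factors and without the manifold regularization; the exact factorization form used by LSTF, the gene labels, and all numeric values here are assumptions for illustration only.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def rank_pairs(scores: Matrix) -> List[Tuple[float, int, int]]:
    """Rank unordered pairs (i < j) by reconstructed score, highest first."""
    pairs = [(scores[i][j], i, j)
             for i in range(len(scores)) for j in range(i + 1, len(scores))]
    return sorted(pairs, reverse=True)


genes = ["geneA", "geneB", "geneC", "geneD"]  # hypothetical labels
# F: gene-by-latent-factor embedding (hand-picked, not learned).
F = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# S: latent-factor interaction matrix; off-diagonal ones mean genes loading
# on different factors are the ones scored as likely SL partners.
S = [[0.0, 1.0], [1.0, 0.0]]

scores = matmul(matmul(F, S), transpose(F))  # reconstructed associations
ranking = rank_pairs(scores)
top_score, top_i, top_j = ranking[0]
```

In this toy setup the highest-scoring pair is the one whose two genes load entirely on different latent factors; a trained model would learn F and S from the known associations instead.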
{"title":"Multiple Heterogeneous Networks Representation With Latent Space for Synthetic Lethality Prediction","authors":"Xiangjin Hu;Haoran Yi;Hao Cheng;Yijing Zhao;Dongqi Zhang;Jinxin Li;Jingjing Ruan;Jin Zhang;Xinguo Lu","doi":"10.1109/TNB.2024.3444922","DOIUrl":"10.1109/TNB.2024.3444922","url":null,"abstract":"Computational synthetic lethality (SL) method has become a promising strategy to identify SL gene pairs for targeted cancer therapy and cancer medicine development. Feature representation for integrating various biological networks is crutial to improve the identification performance. However, previous feature representation, such as matrix factorization and graph neural network, projects gene features onto latent variables by keeping a specific geometric metric. There is a lack of models of gene representational latent space with considerating multiple dimentionalities correlation and preserving latent geometric structures in both sample and feature spaces. Therefore, we propose a novel method to model gene Latent Space using matrix Tri-Factorization (LSTF) to obtain gene representation with embedding variables resulting from the potential interpretation of synthetic lethality. Meanwhile, manifold subspace regularization is applied to the tri-factorization to capture the geometrical manifold structure in the latent space with gene PPI functional and GO semantic embeddings. Then, SL gene pairs are identified by the reconstruction of the associations with gene representations in the latent space. The experimental results illustrate that LSTF is superior to other state-of-the-art methods. 
Case study demonstrate the effectiveness of the predicted SL associations.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"23 4","pages":"564-571"},"PeriodicalIF":3.7,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-14 DOI: 10.1109/TNB.2024.3443244
Wenjie Yao;Ankang Wei;Zhen Xiao;Weizhong Zhao;Xianjun Shen;Xingpeng Jiang;Tingting He
Detecting side effects of drugs is a fundamental task in drug development. With the expansion of publicly available biomedical data, researchers have proposed many computational methods for predicting drug-side effect associations (DSAs), among which network-based methods have attracted wide attention in the biomedical field. However, data scarcity poses a great challenge for existing DSA prediction models. Although several data augmentation methods have been proposed to address this issue, most existing methods manipulate the original networks randomly, ignoring the causality underlying DSAs and leading to poor performance on DSA prediction. In this paper, we propose a counterfactual inference-based data augmentation method to improve performance on this task. First, we construct a heterogeneous information network (HIN) by integrating multiple biomedical data sources. Based on community detection on the HIN, a counterfactual inference-based method is designed to derive augmented links, and an augmented HIN is obtained accordingly. Then, a meta-path-based graph neural network is applied to learn high-quality representations of drugs and side effects, from which the predicted DSAs are obtained. Finally, comprehensive experiments are conducted, and the results demonstrate the effectiveness of the proposed counterfactual inference-based data augmentation for DSA prediction.
{"title":"An Improved Framework for Drug-Side Effect Associations Prediction via Counterfactual Inference-Based Data Augmentation","authors":"Wenjie Yao;Ankang Wei;Zhen Xiao;Weizhong Zhao;Xianjun Shen;Xingpeng Jiang;Tingting He","doi":"10.1109/TNB.2024.3443244","DOIUrl":"10.1109/TNB.2024.3443244","url":null,"abstract":"Detecting side effects of drugs is a fundamental task in drug development. With the expansion of publicly available biomedical data, researchers have proposed many computational methods for predicting drug-side effect associations (DSAs), among which network-based methods attract wide attention in the biomedical field. However, the problem of data scarcity poses a great challenge for existing DSAs prediction models. Although several data augmentation methods have been proposed to address this issue, most of existing methods employ a random way to manipulate the original networks, which ignores the causality of existence of DSAs, leading to the poor performance on the task of DSAs prediction. In this paper, we propose a counterfactual inference-based data augmentation method for improving the performance of the task. First, we construct a heterogeneous information network (HIN) by integrating multiple biomedical data. Based on the community detection on the HIN, a counterfactual inference-based method is designed to derive augmented links, and an augmented HIN is obtained accordingly. Then, a meta-path-based graph neural network is applied to learn high-quality representations of drugs and side effects, on which the predicted DSAs are obtained. 
Finally, comprehensive experiments are conducted, and the results demonstrate the effectiveness of the proposed counterfactual inference-based data augmentation for the task of DSAs prediction.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":"23 4","pages":"540-547"},"PeriodicalIF":3.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141982181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-13 DOI: 10.1109/TNB.2024.3442912
Ghazaleh Babanejaddehaki;Aijun An;Heidar Davoudi
Given the persistent global challenge presented by rapidly spreading diseases, as evidenced notably by the widespread impact of the COVID-19 pandemic on both human health and economies worldwide, developing effective infectious disease prediction models has become of utmost importance. In this context, online social media platforms have gained prominence as valuable tools in healthcare settings, offering direct avenues for disseminating critical health information to the public in a timely and accessible manner. Propelled by the ubiquitous accessibility of the internet through computers and mobile devices, these platforms promise to revolutionize traditional detection methods, providing more immediate and reliable epidemiological insights. Leveraging this paradigm shift, our proposed framework harnesses Twitter data associated with infectious disease symptoms, employing an ontology to identify and curate relevant tweets. Central to our methodology is a hybrid model that integrates XGBoost and Bidirectional Long Short-Term Memory (BiLSTM) architectures. The integration of XGBoost addresses the challenge of handling small dataset sizes, which are inherent during outbreaks because time series data are limited. XGBoost serves as a cornerstone for minimizing the loss function and identifying optimal features from our multivariate time series data. Subsequently, the combined dataset, comprising the original features and the values predicted by XGBoost, is channeled into the BiLSTM for further processing. Through extensive experimentation with a dataset spanning multiple infectious disease outbreaks, our hybrid model demonstrates superior predictive performance compared to state-of-the-art and baseline models. By enhancing forecasting accuracy and outbreak tracking capabilities, our model offers promising prospects for assisting health authorities in mitigating fatalities and proactively preparing for potential outbreaks.
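The data flow between the two stages of the hybrid model, in which first-stage predictions are appended to the original features before sequence modelling, can be sketched without either library. In the sketch below a trivial row-mean predictor stands in for XGBoost and no BiLSTM is trained; only the feature augmentation and windowing are shown. The function names, the window length, and the toy counts are assumptions for illustration.

```python
from typing import List

Row = List[float]


def stage_one_predict(rows: List[Row]) -> List[float]:
    """Stand-in first-stage model (NOT XGBoost): mean of each row's features."""
    return [sum(r) / len(r) for r in rows]


def augment(rows: List[Row], preds: List[float]) -> List[Row]:
    """Append each first-stage prediction to its row as an extra feature."""
    if len(rows) != len(preds):
        raise ValueError("one prediction is required per row")
    return [r + [p] for r, p in zip(rows, preds)]


def sliding_windows(rows: List[Row], length: int) -> List[List[Row]]:
    """Cut the time series into overlapping windows for a sequence model."""
    return [rows[i:i + length] for i in range(len(rows) - length + 1)]


# Hypothetical daily counts of tweets mentioning two symptoms.
series = [[10.0, 4.0], [12.0, 6.0], [15.0, 9.0], [21.0, 11.0], [30.0, 14.0]]

preds = stage_one_predict(series)
combined = augment(series, preds)          # 5 time steps x 3 features
windows = sliding_windows(combined, 3)     # input batches for a sequence model
```

Each window would then be fed to the second-stage sequence model; swapping the stand-in for a trained gradient-boosting regressor changes only `stage_one_predict`.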
Title: Ontology-Based Data Collection for a Hybrid Outbreak Detection Method Using Social Media. Journal: IEEE Transactions on NanoBioscience, vol. 23, no. 4, pp. 591-602. Publication date: 2024-08-13. DOI: 10.1109/TNB.2024.3442912
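The data flow described in this abstract (first-stage predictions appended to the original features, then passed to a sequence model) can be sketched as follows. This is only an illustration under stated assumptions: an ordinary least-squares fit stands in for XGBoost, the BiLSTM itself is omitted, and the function names, the window length of 7, and the synthetic data are all hypothetical rather than taken from the paper.

```python
import numpy as np

def augment_with_predictions(X, y, fit_predict):
    """Append a first-stage model's predictions to X as one extra column."""
    y_hat = fit_predict(X, y)
    return np.column_stack([X, y_hat])

def make_windows(features, targets, window):
    """Slice a multivariate series into (samples, window, n_features) arrays,
    pairing each window with the target at the step that follows it."""
    n = len(features) - window
    xs = [features[i:i + window] for i in range(n)]
    ys = [targets[i + window] for i in range(n)]
    return np.stack(xs), np.array(ys)

def least_squares_fit_predict(X, y):
    # Stand-in for the XGBoost stage (assumption): least squares with a bias term.
    Xb = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return Xb @ coef

# Synthetic data: 30 time steps, 4 hypothetical tweet-derived features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = X @ np.array([0.5, -1.0, 0.0, 2.0]) + rng.normal(scale=0.1, size=30)

stacked = augment_with_predictions(X, y, least_squares_fit_predict)
seq_X, seq_y = make_windows(stacked, y, window=7)
print(stacked.shape, seq_X.shape, seq_y.shape)  # (30, 5) (23, 7, 5) (23,)
```

The array `seq_X` has the (samples, time steps, features) layout that recurrent layers typically expect. In a real pipeline the first-stage model would presumably be fitted on the training period only, so that its predictions do not leak target information into the second stage.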
Pub Date : 2024-08-12 DOI: 10.1109/TNB.2024.3441689
Yan Wang;Jie Hong;Yuting Lu;Nan Sheng;Yuan Fu;Lili Yang;Lingyu Meng;Lan Huang;Hao Wang
Pancreatic cancer is one of the most malignant cancers, with rapid progression and poor prognosis. Transcriptional data can be effective in finding new biomarkers for pancreatic cancer. Many network-based methods have been proposed to identify cancer biomarkers, some of which incorporate network controllability. However, most existing methods do not study RNA, rely on prior knowledge and mutation information, or can only perform classification tasks. In this study, we propose RDDriver, a method that combines a Relational Graph Convolutional Network and a Deep Q-Network to identify pancreatic cancer biomarkers based on a multi-layer heterogeneous transcriptional regulation network. First, we construct a regulation network containing long non-coding RNA, microRNA, and messenger RNA. Second, the Relational Graph Convolutional Network is used to learn the node representations. Finally, we use the idea of the Deep Q-Network to build a model that scores and prioritizes each RNA with the Popov-Belevitch-Hautus criterion. We train RDDriver on three small simulated networks and calculate the average score after applying the model parameters to the regulation networks separately. To demonstrate the effectiveness of the method, we compare RDDriver with eight other methods on an approximate benchmark of three types of cancer driver RNAs.
Title: A Controllability Reinforcement Learning Method for Pancreatic Cancer Biomarker Identification. Journal: IEEE Transactions on NanoBioscience, vol. 23, no. 4, pp. 556-563. Publication date: 2024-08-12. DOI: 10.1109/TNB.2024.3441689. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10633729
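For readers unfamiliar with the Popov-Belevitch-Hautus (PBH) criterion named in this abstract, the rank test can be sketched as below: a linear system with state matrix A and input matrix B is controllable exactly when the matrix [λI − A, B] has full row rank for every eigenvalue λ of A. The abstract does not say how RDDriver converts this criterion into a per-RNA score, so the sketch shows only the test itself; the function name, the tolerance, and the three-node example network are hypothetical.

```python
import numpy as np

def pbh_controllable(A, B, tol=1e-9):
    """Return True if (A, B) passes the PBH rank test at every eigenvalue of A."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        # PBH test: [lam*I - A, B] must have full row rank n.
        M = np.hstack([lam * np.eye(n) - A, B])
        if np.linalg.matrix_rank(M, tol=tol) < n:
            return False
    return True

# Hypothetical 3-node regulatory chain 1 -> 2 -> 3 with distinct self-decay
# rates; A[i, j] is the influence of node j on node i.
A = np.array([[-1.0,  0.0,  0.0],
              [ 1.0, -2.0,  0.0],
              [ 0.0,  1.0, -3.0]])

drive_head = np.array([[1.0], [0.0], [0.0]])  # control signal applied to node 1
drive_tail = np.array([[0.0], [0.0], [1.0]])  # control signal applied to node 3

print(pbh_controllable(A, drive_head))  # True: the input reaches every node
print(pbh_controllable(A, drive_tail))  # False: nodes 1 and 2 are unreachable
```

Driving the head of the chain controls the whole network, while driving the tail does not, which is the kind of distinction a controllability-based ranking of candidate driver nodes relies on.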