2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Latest Papers, Page 10

EEG Based Depression Recognition by Employing Static and Dynamic Network Metrics

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9994864

Shuting Sun, Chang Yan, Juntong Lyu, Yueran Xin, Jieyuan Zheng, Zhaolong Yu, B. Hu

Neural circuit dysfunction underlies the biological mechanisms of major depressive disorder (MDD). However, little is known about how the brain’s dynamic connectomes differ between depressed patients and normal controls. We therefore collected resting-state electroencephalography (EEG) from 16 MDD patients and 16 controls using a 128-electrode geodesic sensor net. Static and dynamic network metrics were then applied to explore the abnormal topological structure of MDD patients and to distinguish them from normal controls using traditional machine learning algorithms with feature selection methods. Results showed that MDD patients tend to have a more randomized network organization in both static and dynamic networks. We also found that the combined static-dynamic feature set usually outperformed the others, with a highest accuracy of 79.25% in the delta band. Lower frequency bands (delta, theta) showed relatively better outcomes than higher frequency bands (alpha, beta). The results also indicate the role of functional segregation features as a potential biomarker for depression. In conclusion, the neuropathological mechanism of depression may be more objectively quantified and evaluated by combining static and dynamic networks.
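
As a rough illustration of what "static" and "dynamic" network metrics can mean for EEG, the sketch below computes the average clustering coefficient (one commonly used functional segregation measure) once over the whole recording and once per sliding window. The abstract does not name the connectivity measure, the thresholding rule, or the window length, so Pearson correlation, the 20% edge density, and the `win`/`step` values are all assumptions of this sketch, and the function names are hypothetical.

```python
import numpy as np

def binarize(conn, density=0.2):
    """Keep roughly the strongest `density` fraction of off-diagonal links."""
    n = conn.shape[0]
    w = np.abs(conn).copy()
    np.fill_diagonal(w, 0.0)
    upper = w[np.triu_indices(n, k=1)]
    k = max(1, int(round(density * upper.size)))
    thresh = np.sort(upper)[-k]
    adj = (w >= thresh).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

def mean_clustering(adj):
    """Average clustering coefficient of an undirected binary graph."""
    deg = adj.sum(axis=1)
    triangles = np.diag(adj @ adj @ adj) / 2.0   # triangles through each node
    possible = deg * (deg - 1) / 2.0             # neighbour pairs of each node
    local = np.divide(triangles, possible,
                      out=np.zeros(len(deg)), where=possible > 0)
    return float(local.mean())

def static_and_dynamic_clustering(eeg, win=256, step=128, density=0.2):
    """eeg: array of shape (channels, samples).

    Returns the whole-recording value and one value per sliding window.
    """
    static = mean_clustering(binarize(np.corrcoef(eeg), density))
    dynamic = np.array([
        mean_clustering(binarize(np.corrcoef(eeg[:, s:s + win]), density))
        for s in range(0, eeg.shape[1] - win + 1, step)
    ])
    return static, dynamic
```

Summary statistics of the per-window values (for example their mean or variance) could then be combined with the static value to form a joint feature vector for a classifier.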

Citations: 0

Integrating Prior Knowledge with Graph Encoder for Gene Regulatory Inference from Single-cell RNA-Seq Data

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995287

Jiawei Li, Fan Yang, Fang Wang, Yu Rong, P. Zhao, Shizhan Chen, Jianhua Yao, Jijun Tang, Fei Guo

Inferring gene regulatory networks (GRNs) from single-cell transcriptomes is critical for systematically understanding cell-specific regulatory networks and discovering drug targets in tumor cells. Here we show that existing methods mainly perform co-expression analysis and apply image-based models to the non-Euclidean scRNA-seq data, which may neither handle the dropout problem adequately nor take full advantage of validated gene regulatory topology. We propose a graph-based end-to-end deep learning model for GRN inference (GRNInfer) that exploits known regulatory relations through transductive learning. Comparative experiments demonstrate the robustness and superiority of the model.

Citations: 0

An integrated Extreme learning machine based on kernel risk-sensitive loss of q-Gaussian and voting mechanism for sample classification

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9994976

Zhi-Yuan Li, Ying-Lian Gao, Zhen Niu, Shasha Yuan, C. Zheng, Jin-Xing Liu

Ensemble learning trains and combines multiple learners to complete a learning task. It can improve the stability of the overall model, and a good ensemble method can further improve its accuracy. Meanwhile, the Extreme Learning Machine (ELM), one of the outstanding representatives of machine learning, has attracted continuous attention from experts and scholars. To get a better representation of the feature space, we extend the Gaussian kernel in the kernel risk-sensitive loss and propose a Kernel Risk-Sensitive Loss of q-Gaussian kernel and Hyper-graph Regularized Extreme Learning Machine method. Because randomness in the ELM training process cannot be completely avoided, the stability of most ELM methods is affected to some extent. We therefore introduce a voting mechanism and propose a new ELM classification model named Kernel Risk-Sensitive Loss of q-Gaussian kernel and Hyper-graph Regularized Integrated Extreme Learning Machine based on Voting Mechanism, which improves the stability of the model through the idea of ensemble learning. We apply the new model to six real data sets, and the experimental results show that it is competitive, especially in classification accuracy and stability.
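
To make the voting idea concrete, here is a minimal sketch of a plain ELM (random hidden layer, output weights from regularized least squares) combined by majority vote. It is not the authors' model: the q-Gaussian kernel risk-sensitive loss and the hyper-graph regularization are replaced by an ordinary ridge penalty, and the names `BasicELM` and `vote_predict` are hypothetical.

```python
import numpy as np

class BasicELM:
    """Plain ELM classifier; the ridge term stands in for the paper's loss."""

    def __init__(self, n_hidden=50, reg=1e-2, seed=0):
        self.n_hidden, self.reg = n_hidden, reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        T = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot targets
        # Hidden-layer weights are drawn at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)  # ridge least-squares output weights
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]

def vote_predict(models, X):
    """Majority vote over the predictions of several fitted models."""
    votes = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    winners = []
    for column in votes.T:
        values, counts = np.unique(column, return_counts=True)
        winners.append(values[np.argmax(counts)])
    return np.array(winners)
```

Training several `BasicELM` instances with different seeds and passing them to `vote_predict` averages out the effect of any single unlucky random initialization, which is the stability argument made in the abstract.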

Citations: 0

The Mechanism of Action of Network Pharmacology Integrated with Molecular Docking to Explore Wumei Pills in Treating Gastric Cancer

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995670

Zhongwen Lu, Shuang Zhang, Fei Teng, Xuanhe Tian, Xijian Liu, Xiaochun Han

Objective: This study aimed to explore the mechanism of action of Wumei Pills (WMP) in treating gastric cancer (GC) based on network pharmacology and molecular docking. Methods: The active ingredients of WMP were obtained from the traditional Chinese medicine system pharmacology database, and their target sites were obtained from the PharmMapper database. Target genes of GC were identified through GeneCards, the Therapeutic Target Database, and other databases. The intersection of the two sets was used to determine the targets of WMP active ingredients that were related to GC. Cytoscape 3.7.0 was used to build the “compound-traditional Chinese medicine-ingredient-target” network map and screen the core components. The Search Tool for the Retrieval of Interacting Genes/Proteins database and Cytoscape 3.7.0 were used to analyze and visualize potential genes of WMP in treating GC. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were conducted through Metascape. The “target-critical path” network diagram was created by screening relevant pathways by enrichment score. KM plotter and the Gene Expression Profiling Interactive Analysis database were used to draw GC-related survival curves online for the core genes. AutoDock Vina and PyMol were used for molecular docking and visualization. Results: There were 99 intersection targets between the active ingredients of WMP and the disease. Protein-protein interaction network topology analysis revealed ALB, EGFR, SRC, and other key targets. Molecular docking results showed that the key active components bound well to the core targets, and the ALB and ESR1 genes were significant in survival analysis. Conclusion: WMP could treat GC via beta-sitosterol, stigmasterol, and other active ingredients that act on ALB, EGFR, SRC, and other targets. The mechanism could be related to the epithelial cell signal transduction pathway in Helicobacter pylori infection, playing a multi-target and multi-pathway therapeutic role.

Citations: 0

Effect of cell coupling between pacemaker cells on the biological pacemaker in cardiac tissue model

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995116

Yacong Li, Lei Ma, Qince Li, Henggui Zhang, Kuanquan Wang

A biological pacemaker is a therapy for cardiac rhythm disease. It can be created from ventricular myocytes (VMs) by overexpressing the HCN gene, which encodes the hyperpolarization-activated current ($I_{\mathrm{f}}$), and knocking out the Kir2.1 gene, which encodes the inward-rectifier potassium current ($I_{\mathrm{K1}}$). Our previous study built a single-cell biological pacemaker model and clarified how gene expression levels influence the pacemaking activity of a single pacemaker cell. However, the pacemaking ability of pacemaker tissue has not been studied systematically, and it is not clear which factors affect the pacemaker’s synchronization and the propagation of spontaneous beating. Biological research has indicated that both sinoatrial node and pacemaker cells express less connexin than unexcitable cardiac cells, which suggests that the pacemaking ability of a pacemaker might be improved by decreasing its cell coupling. Another possible factor is the number of pacemaker cells. Intuitively, increasing the number of cells promotes pacemaking behaviour, but an excessive number of pacemaker cells is impractical in the clinic. The balance between pacemaker cell number and cell coupling is therefore important when applying a biological pacemaker. In this study, we constructed a two-dimensional cardiac tissue model with an electrophysiological description to illustrate the relationship between gap junction coupling and cell number. Based on this model, we modified the coupling between pacemaker cells by adjusting the tissue diffusion coefficient for different numbers of pacemaker cells. Under each condition, the synchronization, pacemaking cycle length, and electrical signal propagation were evaluated. We conclude that weakening cell coupling among pacemaker cells can improve the efficiency of bio-pacemaker therapy. This study may contribute to producing effective pacemakers in the clinic.
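
The abstract does not state the tissue equations. A common formulation for this kind of model, assumed here for illustration only, is the monodomain reaction-diffusion equation, in which a diffusion coefficient $D$ represents gap-junction coupling:

```latex
\frac{\partial V}{\partial t} = -\frac{I_{\mathrm{ion}}}{C_{\mathrm{m}}} + \nabla \cdot \left( D \, \nabla V \right)
```

Here $V$ is the membrane potential, $I_{\mathrm{ion}}$ the total ionic current of the cell model, and $C_{\mathrm{m}}$ the membrane capacitance; all of these symbols are assumptions of this sketch rather than notation taken from the paper. In such a formulation, weakening the coupling among pacemaker cells corresponds to lowering $D$ inside the pacemaker region.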

Citations: 0

A rehabilitation activity monitoring method based on Shallow-CNN

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995387

Si-Jiu Wu, Tianyu Huang, Yihao Li

This paper proposes a shallow convolutional neural network (CNN) model to improve the efficiency and accuracy of real-time human activity recognition (HAR). A Mix-Patch-Layer (MPL) block based on the attention mechanism is added to a traditional convolutional network to enhance the expressiveness of the extracted features. This block makes the features attend to the information shared between their different parts, which compensates for the loss of global information in temporal data features. Experiments show that the block improves the accuracy and efficiency of real-time human activity recognition with a shallow network.

Citations: 0

scSAGAN: A scRNA-seq data imputation method based on Semi-Supervised Learning and Probabilistic Latent Semantic Analysis

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995463

Zehao Xiong, Xiangtao Chen, Jiawei Luo, Cong Shen, Zhongyuan Xu

Single-cell RNA-sequencing (scRNA-seq) technology can reveal cellular heterogeneity with high throughput and resolution, facilitating the profiling of single-cell transcriptomes. However, experimental factors produce a large number of missing values in scRNA-seq data, called dropout events, which affect downstream analysis. Imputation is an effective denoising method, but existing imputation methods still face a major challenge: lack of interpretability. In this study, we propose single-cell Self-Attention Generative Adversarial Networks (scSAGAN), a semi-supervised imputation method for scRNA-seq data. scSAGAN mainly uses Semi-Supervised Learning (SSL) and Probabilistic Latent Semantic Analysis (PLSA), which allow it not only to learn the latent characteristics of different cell types but also to explain its imputation behavior. In clustering experiments, scSAGAN exhibits better clustering performance than all baselines on 7 datasets. We then interpret the imputation behavior of scSAGAN on datasets such as Alzheimer’s disease and find causative genes associated with the corresponding datasets. scSAGAN is open source and available at https://github.com/zehaoxiongl23/scSAGAN.

Citations: 0

3D ARCNN: An Asymmetric Residual CNN for Decreasing False Positive Rate of Lung Nodules Detection

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9994973

Bo Liu, Hong Song, Qiang Li, Yucong Lin, Jian Yang

Lung cancer has the highest morbidity and mortality, and early detection of cancerous changes is essential to reduce the risk of death. To achieve this, it is necessary to reduce the false positive rate of detection. In this paper, we propose a novel asymmetric residual network, called 3D ARCNN, to reduce the false positive rate of lung nodule detection. 3D ARCNN consists of asymmetric convolutional and multilayer cascaded residual network structures. To address the large parameter counts and poor reproducibility of deep neural networks, the proposed model uses asymmetric convolution to reduce the number of model parameters and enhance generalization. In addition, the model uses an internally cascaded multi-stage residual structure to prevent the vanishing and exploding gradient problems of deep networks. Experiments are performed on the public dataset LUNA16. Our method achieved detection sensitivities of 91.6%, 92.7%, 93.2% and 95.8% at 1, 2, 4 and 8 false positives per scan, respectively, with an average CPM (competition performance metric) score of 0.912. Experimental results show that the proposed 3D ARCNN is very useful for reducing the false positive rate of lung nodule detection in the clinic.
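
The parameter saving from asymmetric convolution can be checked with simple arithmetic. The sketch below compares a full k×k×k 3D kernel with a generic factorization into k×1×1, 1×k×1 and 1×1×k kernels applied in sequence. The abstract does not say which factorization 3D ARCNN actually uses, so this layout is an assumption, bias terms are ignored, and the function names are hypothetical.

```python
def conv3d_weights(c_in, c_out, kernel):
    """Number of weights in a 3D convolution, ignoring biases."""
    kd, kh, kw = kernel
    return c_in * c_out * kd * kh * kw

def full_vs_asymmetric(c_in, c_out, k=3):
    """Weights of one k*k*k kernel versus three chained 1D-shaped kernels."""
    full = conv3d_weights(c_in, c_out, (k, k, k))
    asym = (conv3d_weights(c_in, c_out, (k, 1, 1))
            + conv3d_weights(c_out, c_out, (1, k, 1))
            + conv3d_weights(c_out, c_out, (1, 1, k)))
    return full, asym
```

For 64 input and 64 output channels with k = 3, `full_vs_asymmetric(64, 64)` gives 110592 weights for the full kernel against 36864 for the factorized one, a threefold reduction.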

Citations: 2

Multi-level translocation events analysis in solid-state nanopore current traces

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995453

Xinlong Liu, Zepeng Sun, W. Liu, Feng Qiao, Li Cui, Jing Yang, Jingjie Sha, Jian Li, Li-Qun Xu

Solid-state nanopores have shown impressive performance in several sequencing research scenarios, such as biomolecule conformation detection, biomarker identification, and protein fingerprinting. In all these scenarios, accurate event detection is the fundamental step toward data analysis. Most existing event detection methods use either user-defined thresholds or adaptive thresholds determined automatically from the data. The former depend heavily on human expertise and are labor-intensive; the latter appear more advanced, but setting their threshold parameters is somewhat tricky. Hence, results are usually inconsistent across methods. In this paper, we develop a novel event detection method in which the selection threshold is computed from an analytical expression. Unlike other methods, it locates each event’s starting and ending points based on the slope rather than picking the first point whose current value crosses the baseline. Moreover, we add a method to determine whether multiple levels are present within each event. We then evaluate the method on two groups of current traces generated by short ssDNA and 48.5 kb λ-DNA samples, respectively. The results show that our method performs well at detecting challenging translocation events with relatively low amplitudes and can accurately locate the starting and ending points of each level of the events.
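
The difference between threshold-crossing and slope-based boundaries can be sketched as follows. The analytical threshold expression from the paper is not given in the abstract, so this sketch substitutes a generic rule (median baseline minus `k` robust standard deviations), uses a very simple slope criterion, and does not attempt multi-level analysis. The function name and all parameters are hypothetical.

```python
from statistics import median

def detect_events(trace, k=5.0, slope_tol=None):
    """Find current blockades and widen them along their steep edges.

    Returns (start, end) index pairs: start is the last sample before the
    current begins to fall, end is the first sample after it stops rising.
    """
    baseline = median(trace)
    # Robust noise estimate from the median absolute deviation.
    sigma = 1.4826 * median(abs(x - baseline) for x in trace)
    if slope_tol is None:
        slope_tol = 3.0 * sigma  # assumed bound on sample-to-sample noise
    threshold = baseline - k * sigma
    n = len(trace)
    events = []
    i = 0
    while i < n:
        if trace[i] >= threshold:
            i += 1
            continue
        start = i
        while i < n and trace[i] < threshold:
            i += 1
        end = i  # first sample back above the threshold
        # Walk outward while the edge is still steeper than the noise.
        while start > 0 and trace[start] - trace[start - 1] < -slope_tol:
            start -= 1
        while end < n and trace[end] - trace[end - 1] > slope_tol:
            end += 1
        events.append((start, end))
        i = max(i, end)
    return events
```

A pure threshold detector would report the first and last samples below the threshold; the two inner loops instead follow the falling and rising edges outward, which matters most for shallow events whose edges span several samples.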



Citations: 1

Longest k-tuple Common Sub-Strings

2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Pub Date : 2022-12-06 DOI: 10.1109/BIBM55620.2022.9995199

Tiantian Li, Daming Zhu, Haitao Jiang, Haodi Feng, Xuefeng Cui

We study a new problem: finding a longest k-tuple of common sub-strings (abbr. k-CSS) of two or more strings. We present a suffix-tree-based algorithm that finds a longest k-CSS of m strings in $O(kmn^{k})$ time and $O(kmn)$ space, where n is the total length of the m strings. The algorithm can also approximate the longest k-CSS problem to a performance ratio of $\frac{1}{\epsilon}$ in $O(kmn^{\lceil \epsilon k \rceil})$ time for $\epsilon \in (0,1]$. Because its space complexity is linear in n, it is advantageous when comparing particularly long strings. The algorithm proves that finding a longest gapped pattern of a non-constant number of strings is solvable in polynomial time if the number of gaps is restricted to a constant, although the problem without any restriction on the number of gaps was proved NP-hard. Using a C++ tool built on the algorithm, we ran experiments to find longest 2-CSSs, 3-CSSs and 5-CSSs of 2 to 14 COVID-19 S-proteins. With the help of longest 2-CSSs and 3-CSSs of COVID-19 S-proteins, we identified the mutation sites in the S-proteins of two COVID-19 variants, Delta and Omicron. The tool is available for download at https://github.com/lytt0/k-CSS.
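To make the objective concrete, here is a small sketch for the special case of two strings. It rests on an assumption the abstract does not spell out: a k-CSS is taken to be up to k substrings that occur in the same order, without overlapping, in both strings, and "longest" means maximum total length. The code is a plain dynamic program, not the paper's suffix-tree algorithm, and the function name `longest_k_css_length` is hypothetical.

```python
def longest_k_css_length(a: str, b: str, k: int) -> int:
    """Max total length of at most k ordered, non-overlapping common
    substrings of a and b (assumed reading of the k-CSS objective)."""
    n, m = len(a), len(b)
    # prev_best[i][j]: best total within a[:i], b[:j] using the blocks
    # allowed so far; with zero blocks the total is 0 everywhere.
    prev_best = [[0] * (m + 1) for _ in range(n + 1)]
    for _ in range(k):
        best = [[0] * (m + 1) for _ in range(n + 1)]
        # end[i][j]: best total when the newest block ends at a[i-1], b[j-1].
        end = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if a[i - 1] == b[j - 1]:
                    # Either extend the current block or open a new one
                    # on top of the best solution with one block fewer.
                    end[i][j] = 1 + max(end[i - 1][j - 1],
                                        prev_best[i - 1][j - 1])
                best[i][j] = max(best[i - 1][j], best[i][j - 1], end[i][j])
        prev_best = best
    return prev_best[n][m]


print(longest_k_css_length("abcXdef", "abcYdef", 1))  # 3
print(longest_k_css_length("abcXdef", "abcYdef", 2))  # 6
```

This runs in O(k·|a|·|b|) time for two strings. Extending the same table to m strings would make it exponential in m, which is the regime where an approach that is polynomial for a non-constant number of strings matters.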



Citations: 0