CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics: Latest Articles

Fine-Tuning A Lightweight Convolutional Neural Networks for COVID-19 Diagnosis

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429218

Jaturong Kongmanee, Thanyathorn Thanapattheerakul

In this paper, we compare the performance of deep neural network-based image classifiers, fine-tuned with different hyperparameter configurations, for automatic COVID-19 diagnosis from the varied and limited chest X-ray image dataset provided by the Deep Learning and Artificial Intelligence Summer School 3 (DLAI3). We show that high accuracy can be obtained by combining transfer learning with a well fine-tuned convolutional neural network. Moreover, we seek smaller deep learning architectures with fewer trainable parameters, which reduce the training and inference time of AI applications on mobile and edge devices while retaining relatively high performance. The results from the DLAI3 hackathon session show that our model outperforms the other submitted models in terms of effectiveness and generalization.

Citations: 3

COVID19 Chest X-Ray Classification with Simple Convolutional Neural Network

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429216

Chenqi Li, Maggie Wang, Grace Wu, Khadija Rana, Nipon Charoenkitkarn, Jonathan H. Chan

The COVID-19 outbreak calls for quick, accurate, and accessible detection methods. Applying convolutional neural networks to chest X-ray images is a promising solution; however, X-ray device configurations vary and data quality is inconsistent across datasets. This leads to overfitting on a particular set of training data. This paper explores methods to mitigate overfitting.

Citations: 3

Computational study of inertial effects in toroidal and helical microchannels

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429222

K. Kovalcíková, A. Bugánová, I. Cimrák

This article studies fluid flow characteristics and their dependence on device geometry and fluid parameters. We computationally examined three types of microfluidic channel geometries: a torus, a cylindrical annulus with a square cross-section, and a helix. Several parameters were varied across simulations: the kinematic viscosity of the fluid, the velocity of the fluid, the curvature radius, the cross-section dimension, and, for helical channels, the pitch. We analyzed the velocity distribution of the primary flow and the shape of the Dean vortices of the secondary inertial flow; for helical channels, we also analyzed S-shaped streamlines within non-perpendicular cross-sections. We also analyzed the dependence of the secondary flow velocity and vorticity on the average velocity of the primary flow. We confirm that the Dean effect is present in our numerical simulations and that it can be further investigated as a sorting tool for cells in suspension.
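
The abstract does not say which dimensionless groups were used, so the following is only a minimal sketch under the common textbook definitions: Reynolds number Re = U·D_h/ν and Dean number De = Re·sqrt(D_h/(2·R_c)). The function names and the sample values are hypothetical; they merely show how the varied parameters (kinematic viscosity, velocity, curvature radius, cross-section dimension) combine into a single measure of the strength of the secondary flow.

```python
import math


def hydraulic_diameter(width: float, height: float) -> float:
    """Hydraulic diameter of a rectangular cross-section: 4 * area / perimeter."""
    return 2.0 * width * height / (width + height)


def reynolds_number(velocity: float, d_h: float, kinematic_viscosity: float) -> float:
    """Re = U * D_h / nu (all quantities in SI units)."""
    return velocity * d_h / kinematic_viscosity


def dean_number(velocity: float, d_h: float, kinematic_viscosity: float,
                curvature_radius: float) -> float:
    """De = Re * sqrt(D_h / (2 * R_c)), one common textbook definition."""
    re = reynolds_number(velocity, d_h, kinematic_viscosity)
    return re * math.sqrt(d_h / (2.0 * curvature_radius))


if __name__ == "__main__":
    # Hypothetical values, not taken from the paper:
    # 100 um square channel, 0.1 m/s mean velocity, water-like viscosity,
    # 1 mm radius of curvature.
    d_h = hydraulic_diameter(100e-6, 100e-6)
    print("Re =", reynolds_number(0.1, d_h, 1e-6))
    print("De =", dean_number(0.1, d_h, 1e-6, 1e-3))
```

Under this definition, a smaller curvature radius or a higher mean velocity raises De, which is the regime in which Dean vortices become more pronounced.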

Citations: 1

In-Silico Study for Potential Inhibitors of Both HSP72 and HSC70 Proteins in the Treatment of Cancer

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429226

Mohammad Kawsar Sharif Siam, Afsana Karim, Mohammad Umer Sharif Shohan

HSP90 (heat shock protein 90) is a molecular chaperone with various oncogenic client proteins, which play a significant role in initiating the hallmarks of cancer cells. The "HSP90 addiction" of cancer cells makes it a suitable target in cancer treatment. Inhibition of HSP90 mitigates tumor progression but results in over-expression of the HSP70 family (the 70-kDa heat shock proteins). The HSP70 family is expressed abundantly in human tumors, and high expression of HSP70 in cancer cells is responsible for tumor progression. It has been found that simultaneous inhibition of both heat shock 70 kDa protein 1A (HSP72) and heat shock cognate 71-kDa protein (HSC70), two isoforms of the HSP70 family, leads to the inhibition of HSP90 client proteins. In this study, molecular docking was used to search for the best possible inhibitors of HSP72 and HSC70. Zafirlukast, a potent inhibitor of both isoforms, was used as the reference drug. The binding affinities of Zafirlukast with HSP72 (PDB ID 5AQZ) and HSC70 (PDB ID 4H5N) are -10.5 and -9.9 kcal/mol, respectively. One hundred potential inhibitors (anti-diabetic drugs, anti-rheumatic drugs, anti-inflammatory drugs, statins, and small-molecule inhibitors) were screened in silico, and Apoptozole was found to be a potential inhibitor of both HSP72 and HSC70, with strong binding affinities of -11.0 and -10.2 kcal/mol, respectively. Protein-ligand interactions were monitored and visualized with Discovery Studio to better understand the nature of the intermolecular bonds. Furthermore, ADMET properties were obtained from admetSAR 2.0 and compared with those of the reference drug for validation.
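
To put the reported docking scores in perspective, one can treat them as approximate binding free energies and apply ΔG = RT ln K_d. This is a rough sketch and not part of the study: docking scores are not calibrated free energies, and the temperature (298.15 K) and the helper names are assumptions made here for illustration. Only the four affinity values come from the abstract.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

# Docking scores in kcal/mol, as reported in the abstract.
SCORES = {
    "HSP72": {"Zafirlukast": -10.5, "Apoptozole": -11.0},
    "HSC70": {"Zafirlukast": -9.9, "Apoptozole": -10.2},
}


def kd_from_dg(dg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (mol/L, 1 M standard state) from dG = RT ln Kd."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def kd_ratio(dg_candidate: float, dg_reference: float,
             temperature_k: float = 298.15) -> float:
    """Kd(candidate) / Kd(reference); a value below 1 means tighter predicted binding."""
    return math.exp((dg_candidate - dg_reference) / (R_KCAL * temperature_k))


if __name__ == "__main__":
    for target, scores in SCORES.items():
        ratio = kd_ratio(scores["Apoptozole"], scores["Zafirlukast"])
        print(f"{target}: Kd(Apoptozole) / Kd(Zafirlukast) = {ratio:.2f}")
```

Read this way, a difference of a few tenths of a kcal/mol corresponds to well under an order of magnitude in predicted K_d, which is why such screening hits are normally treated as candidates for further validation.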

Citations: 0

Genome-wide repeat expansions in complex disorders: beyond the coding sequence

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429231

R. Yuen

Identification of underlying genetic factors has provided important information on the functional pathways involved in many complex disorders. However, the causal genetic factors identified so far in many complex disorders generally confer less risk than expected from empirical estimates of their heritability. Tandem DNA repeats make up around 6% of the human genome and have been associated with more than 40 monogenic disorders, but their involvement in complex disorders is largely unknown. I will present our novel approach to detecting genome-wide tandem repeat expansions. This approach has led to the identification of rare tandem repeat expansions contributing to autism spectrum disorder and other related conditions. It provides a model for searching for missing heritability in other complex disorders.

Citations: 0

Beginner's guide to microbiome analysis: Bioinformatics guidelines and practical concepts for amplicon-based microbiome analysis.

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429211

Pichahpuk Uthaipaisanwong, Pantakan Puengrang, C. Rangsiwutisak, Photchanathorn Prombun, Athisri Sitthipunya, Natchaphon Rajudom, K. Kusonmano

The advent of next-generation sequencing (NGS) makes it possible to study living organisms by reading genetic material in a high-throughput manner. The technology has opened up microbial research in areas such as medicine, agriculture, energy, and the environment, allowing a whole microbial community in an environment of interest to be studied without culturing. Bioinformatics analysis is needed to characterize and analyze the microbiota in the studied samples. In this tutorial, we give an overview of microbiome analysis based on high-throughput sequencing of 16S rRNA genes, a commonly used target for classifying bacteria and archaea. After the biological and technological background, microbiome data from a short-read sequencing platform will be explained, followed by all the important computational steps of microbiome analysis. The steps include data preprocessing, amplicon sequence variant analysis, taxonomy assignment, data normalization, and diversity analyses. Practical concepts and code for the microbiome analysis will be demonstrated step by step, providing a basic guideline for beginners.
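
The last two steps listed above, data normalization and diversity analysis, can be sketched in a few lines. This is a minimal illustration, not the tutorial's own code: it assumes total-sum scaling as the normalization, the Shannon index for within-sample diversity, and Bray-Curtis dissimilarity for between-sample diversity. The function names and the ASV count tables are hypothetical.

```python
import math


def relative_abundance(counts: dict) -> dict:
    """Total-sum scaling: convert raw ASV counts to proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: c / total for taxon, c in counts.items()}


def shannon_index(counts: dict) -> float:
    """Alpha diversity: H = -sum(p * ln p) over taxa with p > 0."""
    props = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in props if p > 0)


def bray_curtis(sample_a: dict, sample_b: dict) -> float:
    """Beta diversity on relative abundances: 0 = identical, 1 = no shared taxa."""
    pa, pb = relative_abundance(sample_a), relative_abundance(sample_b)
    taxa = set(pa) | set(pb)
    diff = sum(abs(pa.get(t, 0.0) - pb.get(t, 0.0)) for t in taxa)
    total = sum(pa.get(t, 0.0) + pb.get(t, 0.0) for t in taxa)
    return diff / total


if __name__ == "__main__":
    # Hypothetical ASV count tables for two samples.
    gut = {"ASV1": 120, "ASV2": 60, "ASV3": 20}
    soil = {"ASV2": 10, "ASV4": 150, "ASV5": 40}
    print("Shannon (gut): ", shannon_index(gut))
    print("Shannon (soil):", shannon_index(soil))
    print("Bray-Curtis:   ", bray_curtis(gut, soil))
```

Normalizing before comparing samples matters because sequencing depth differs between samples; other schemes (for example rarefaction) are also in common use and would slot into the same place.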

Citations: 0

Detecting Covid-19 in Chest X-Rays using Transfer Learning with VGG16

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429213

Amy Y Chen, Jonathan Jaegerman, Dunja Matić, Hassaan Inayatali, Nipon Charoenkitkarn, Jonathan H. Chan

Covid-19 is a novel epidemic that has hugely impacted countries worldwide [13], and there is a need for quick and accurate screening methods. Current testing methods include the reverse transcription-polymerase chain reaction test and medical diagnosis using computed tomography scans. Both require expensive technologies as well as highly trained practitioners and are thus in short supply [18]. Less developed countries and overloaded hospitals have increased the demand for cheap, easy, and accurate screening methods [4]. X-ray devices are now cheap, portable, and easy to use; however, few professionals are capable of manually identifying Covid-19 from a chest X-ray. We suggest implementing a machine learning model that incorporates transfer learning to automatically detect Covid-19 from chest X-ray images. The suggested model is built on top of the VGG16 architecture and pre-trained ImageNet weights. Compared with the VGG19, Inception-V3, Inception-ResNet, Xception, ResNet152-V2, and DenseNet201 models, the VGG16 model achieved the highest testing accuracy of 98% on 10 epochs as well as high positive-class accuracy. Gradient-weighted class activation mapping (Grad-CAM) was also applied to detect the regions that have a greater impact on the model's classification decision.
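
The abstract reports overall test accuracy together with "positive-class accuracy" but does not define the latter. Assuming it means the fraction of Covid-19-positive images that the model labels as positive (that is, recall or sensitivity), the two numbers could be computed as sketched below. The function names and the label vectors are hypothetical and are not the paper's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth != positive and pred != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all images classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fp + tn + fn)


def positive_class_accuracy(y_true, y_pred, positive=1):
    """Fraction of truly positive images classified as positive (assumed meaning)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0:
        raise ValueError("no positive examples in y_true")
    return tp / (tp + fn)


if __name__ == "__main__":
    # Hypothetical labels: 1 = Covid-19, 0 = other.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print("accuracy:", accuracy(y_true, y_pred))
    print("positive-class accuracy:", positive_class_accuracy(y_true, y_pred))
```

Reporting both figures is useful on imbalanced chest X-ray datasets, where a high overall accuracy can hide missed positive cases.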

Citations: 8

Uncovering RNA and DNA Modifications from Native Sequences

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429232

I. Nookaew

Ribonucleotide modifications to mRNA play important roles in biological regulation. Over 170 types of RNA modifications have been experimentally validated. Their detection traditionally relies on specific antibody-based enrichment and analytical chemistry tools; these approaches are labor-intensive and can detect only one or a few modifications at a time. This is insufficient to truly assess complete transcriptomes for sequence-specific identification and quantitation of epigenetic signals. Recently, we were the first to use third-generation Oxford Nanopore Technology (ONT) sequencing to directly sequence cellular RNA in its native form at the transcriptomic level. We determined that the method can uncover RNA modifications of any type. Based on the principle that such modifications are absent from cDNA and from synthetic unmodified RNA, we conducted a study that compared sequence features of native modified RNA with unmodified RNA of the same sequence. We developed a bioinformatics tool, ELIGOS (Epitranscriptional Landscape Inferring from Glitches of ONT Signals), that successfully identified modified RNA bases from the native RNA sequences. ELIGOS accurately predicts known classes of RNA methylation sites (AUC > 0.93) in rRNAs from E. coli, yeast, and human cells, using either unmodified in vitro transcribed RNA or our background-error model, which mimics the systematic error in native RNA sequences. The validity of the approach was illustrated in transcriptomes of yeast, mouse, and human cells. We further applied ELIGOS to detect DNA adducts and to distinguish individual alkylated DNA adducts. We analyzed a library of 16 plasmids containing site-specifically inserted O6- or N2-alkyl-deoxyguanosine lesions differing in size, functional group, regiochemistry, and abasic site. Based on the native DNA sequences, ELIGOS can accurately identify the location of individual DNA adducts. Moreover, individual DNA adducts were clearly distinguished from each other at the signal level. The ELIGOS software is publicly available and can be used to detect possible RNA and DNA modification sites at genome scale from native RNA/DNA sequences.
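
The comparison described in the abstract, native reads against an unmodified control of the same sequence, can be illustrated with a per-position odds ratio of base-calling errors. This is a simplified sketch and not the ELIGOS implementation: the function name, the pseudocount, and the read counts are all hypothetical, and a real analysis would add a significance test and multiple-testing correction.

```python
def error_odds_ratio(native_errors: int, native_total: int,
                     control_errors: int, control_total: int,
                     pseudocount: float = 0.5) -> float:
    """Odds of a base-calling error at one position, native vs. unmodified control.

    A pseudocount is added to every cell so the ratio stays finite when a
    cell is zero. Values well above 1 flag a candidate modified base.
    """
    if native_errors > native_total or control_errors > control_total:
        raise ValueError("error count exceeds read count")
    native_ok = native_total - native_errors
    control_ok = control_total - control_errors
    return ((native_errors + pseudocount) * (control_ok + pseudocount)) / (
        (native_ok + pseudocount) * (control_errors + pseudocount))


if __name__ == "__main__":
    # Hypothetical position covered by 100 reads in each condition.
    print("candidate site: ", error_odds_ratio(40, 100, 5, 100))
    print("background site:", error_odds_ratio(5, 100, 5, 100))
```

The control can be either in vitro transcribed RNA or a background-error model, as the abstract notes; in both cases it supplies the expected error rate that the native reads are measured against.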

Citations: 0

DLPAlign: A Deep Learning based Progressive Alignment Method for Multiple Protein Sequences

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429221

Mengmeng Kuang, Yong Liu, Lufei Gao

This paper proposes a novel and straightforward approach to improving the accuracy of progressive multiple protein sequence alignment. We trained a decision-making model based on convolutional neural networks and bi-directional long short-term memory networks, and progressively aligned the input protein sequences by calculating different posterior probability matrices. To evaluate this method, we implemented a multiple sequence alignment tool called DLPAlign and compared its performance with eleven leading alignment methods on three empirical alignment benchmarks (BAliBASE, OXBench and SABMark). Our results show that DLPAlign achieves the best total-column scores on all three benchmarks. When evaluated on the 711 low-similarity families with average PID ≤ 30%, DLPAlign improved by about 2.8% over the second-best MSA software. In addition, we compared the performance of DLPAlign and other alignment tools on a real-life application, namely protein secondary structure prediction on four protein sequences related to SARS-CoV-2, and DLPAlign provided the best result in all cases.
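
The total-column (TC) score used in this comparison is the fraction of reference alignment columns that the test alignment reproduces exactly. The sketch below is a simplified version written for illustration: it scores every non-empty column, whereas benchmark suites such as BAliBASE typically restrict scoring to annotated core regions. The function names and the toy alignments are hypothetical.

```python
def column_signatures(alignment):
    """Map each non-empty column to a tuple of residue indices (None for a gap).

    `alignment` is a list of equal-length gapped strings; the sequence order
    must be the same in every alignment being compared.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must have equal length")
    next_index = [0] * len(alignment)
    signatures = set()
    for column in zip(*alignment):
        signature = []
        for seq_id, char in enumerate(column):
            if char == "-":
                signature.append(None)
            else:
                signature.append(next_index[seq_id])
                next_index[seq_id] += 1
        if any(pos is not None for pos in signature):
            signatures.add(tuple(signature))
    return signatures


def total_column_score(reference, test):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    ref_columns = column_signatures(reference)
    if not ref_columns:
        raise ValueError("reference alignment has no scorable columns")
    return len(ref_columns & column_signatures(test)) / len(ref_columns)


if __name__ == "__main__":
    reference = ["AC-GT", "ACAGT"]
    test = ["ACG-T", "ACAGT"]
    print("TC score:", total_column_score(reference, test))
```

Because a single misplaced residue invalidates its whole column, TC is a stricter measure than pairwise scores, which makes it a demanding metric for low-similarity families.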

Citations: 4

Covid 19 Prediction from X Ray Images Using Fully Connected Convolutional Neural Network

Pub Date : 2020-11-19 DOI: 10.1145/3429210.3429233

Sanghamita Bhoumik, Sayantan Chatterjee, Ankur Sarkar, Abhishek Kumar, Ferdin Joe John Joseph

The COVID-19 pandemic has paralyzed the whole world indiscriminately. Effective testing of people plays a vital role in containing the infection. Usually, chest X-ray image-based diagnosis is carried out manually, which is not only time-consuming but also paves the way for asymptomatic patients to transmit the virus at a faster pace. This paper proposes chest X-ray image analysis using a fully connected convolutional neural network (CNN) to address this problem. The fully connected CNN with two variants of convolution, especially DSC, has proved efficient in detecting COVID-19 infections.
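
The abstract does not expand "DSC". Assuming it stands for depthwise separable convolution, its appeal can be shown by counting trainable parameters: a standard k×k convolution needs k·k·C_in·C_out weights, while the depthwise-plus-pointwise factorization needs k·k·C_in + C_in·C_out. The sketch below is an illustration under that assumption; the layer sizes and function names are hypothetical, not taken from the paper.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int, bias: bool = True) -> int:
    """Trainable parameters of a standard 2D convolution layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(kernel: int, c_in: int, c_out: int,
                               bias: bool = True) -> int:
    """Trainable parameters of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
    standard = standard_conv_params(3, 64, 128)
    separable = depthwise_separable_params(3, 64, 128)
    print("standard: ", standard)
    print("separable:", separable)
    print("reduction factor:", standard / separable)
```

The saving grows with the number of output channels and the kernel size, which is why this factorization is common in networks intended for constrained hardware.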

Citations: 6
