Latest papers: 2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)

Deep Learning for Brain Tumor Segmentation using Magnetic Resonance Images
Surbhi Gupta, Manoj Gupta
Cancer is one of the most significant causes of death worldwide, accounting for millions of deaths each year, and its fatality rate is rising. Over the last three decades, deep neural networks have been critical in cancer research. This article describes the development of a system for fully automated segmentation of brain tumors. In this study, we propose a unique ensemble of Convolutional Neural Networks (ConvNets) for segmenting gliomas from MR images. The ensemble model consists of two fully linked ConvNets, a 2D ConvNet and a 3D ConvNet. The novel model is validated against a single dataset from the Brain Tumor Segmentation (BraTS) challenge, specifically BraTS_2018. The prediction results obtained with the proposed methodology on the BraTS_2018 dataset demonstrate the efficiency of the suggested architecture.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562890 (published 2021-10-13)
Citations: 15
Adversarial Deep Evolutionary Learning for Drug Design
Sheriff Abouchekeir, A. Tchagang, Yifeng Li
The design of a new therapeutic agent is a time-consuming and expensive process. The rise of machine intelligence provides a grand opportunity to expeditiously discover novel drug candidates through smart search in the vast molecular structural space. In this paper, we propose a new approach called adversarial deep evolutionary learning (ADEL) to search for novel molecules in the latent space of an adversarial generative model while continually improving the latent representation space. In ADEL, a custom-made adversarial autoencoder (AAE) model is developed and trained under a deep evolutionary learning (DEL) process. This involves an initial training of the AAE model, followed by multi-objective evolutionary optimization in the continuous latent representation space of the AAE rather than in the discrete structural space of molecules. With an AAE, an arbitrary distribution can be supplied during training so that the latent representation space is matched to that distribution. This provides a starting latent space from which new samples can be produced. Throughout learning, high-quality new samples are generated after each training iteration and added back into the full dataset, allowing for a more comprehensive understanding of the data structure. This combination of evolving data and continuous learning improves not only the generative model but also the data. Comparing ADEL to the previous work on DEL, we see that ADEL can obtain better property distributions.
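To make the multi-objective selection step more concrete, here is a minimal sketch of a Pareto (non-dominated) filter over candidate latent vectors. It is an illustration only: the abstract does not say which selection scheme ADEL uses, and the two objectives, the scores, and the function names `dominates` and `pareto_front` are assumptions, with both objectives treated as "higher is better".

```python
def dominates(a, b):
    """True if score tuple `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the indices of candidates that no other candidate dominates."""
    front = []
    for i, s in enumerate(scores):
        dominated = any(dominates(t, s) for j, t in enumerate(scores) if j != i)
        if not dominated:
            front.append(i)
    return front


# Hypothetical property scores for four decoded latent samples (made-up numbers).
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.95)]
print(pareto_front(scores))  # [0, 1, 3]
```

In an evolutionary loop of the kind described, the surviving candidates would be the ones mutated and recombined in latent space, decoded, and added back to the training data.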
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562949 (published 2021-10-13)
Citations: 2
Retinal Disease Classification from OCT Images Using Deep Learning Algorithms
Jongwoo Kim, L. Tran
Optical Coherence Tomography (OCT) is a noninvasive test that takes cross-sectional images of the retina and allows ophthalmologists to make a diagnosis based on the retina's layers. It is therefore an important modality for the detection and quantification of retinal diseases and abnormalities. Because OCT produces several images for each patient, analyzing them is time-consuming for ophthalmologists. This paper proposes deep learning models that sort patients' OCT images into four categories: Choroidal neovascularization (CNV), Diabetic macular edema (DME), Drusen, and Normal. Two different models are proposed. One uses three binary Convolutional Neural Network (CNN) classifiers and the other uses four binary CNN classifiers. Several CNNs, namely VGG16, VGG19, ResNet50, ResNet152, DenseNet121, and InceptionV3, are adapted as feature extractors to develop the binary classifiers. Among them, the proposed model using VGG16 for CNV vs. other classes, VGG16 for DME vs. other classes, VGG19 for Drusen vs. other classes, and InceptionV3 for Normal vs. other classes shows the best performance, with 0.987 accuracy, 0.987 sensitivity, and 0.996 specificity. The binary classifier for the Normal class has 0.999 accuracy. These results show the models' potential to work as a second reader for ophthalmologists.
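One way to turn four class-vs-others classifiers into a single four-class decision is to pick the class whose classifier is most confident. The sketch below assumes that rule; the abstract does not state how the binary outputs are combined, and `combine_one_vs_rest` and the example scores are hypothetical.

```python
CLASSES = ("CNV", "DME", "Drusen", "Normal")


def combine_one_vs_rest(scores):
    """Pick the class whose class-vs-others classifier gives the highest score.

    `scores` maps each class name to the probability (0..1) that its binary
    classifier assigns to the positive class. This arg-max rule is an
    assumption made for illustration, not the paper's documented method.
    """
    missing = [c for c in CLASSES if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return max(CLASSES, key=lambda c: scores[c])


# Made-up classifier outputs for one OCT image.
example = {"CNV": 0.12, "DME": 0.81, "Drusen": 0.30, "Normal": 0.05}
print(combine_one_vs_rest(example))  # DME
```

A three-classifier variant could follow the same pattern by assigning the fourth class whenever none of the three scores exceeds a chosen threshold.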
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562919 (published 2021-10-13)
Citations: 8
An Efficient Boolean Modelling Approach for Genetic Network Inference
Hasini Nakulugamuwa Gamage, M. Chetty, Adrian B. R. Shatte, J. Hallinan
The inference of Gene Regulatory Networks (GRNs) from time series gene expression data is an effective approach for unveiling important underlying gene-gene relationships and dynamics. While various computational models exist for accurate inference of GRNs, many are computationally inefficient and do not focus on simultaneous inference of both network topology and dynamics. In this paper, we introduce a simple, Boolean network model-based solution for efficient inference of GRNs. First, the microarray expression data are discretized using the average gene expression value as a threshold. This step permits an experimental approach to defining the maximum indegree of a network. Next, regulatory genes, including the self-regulations for each target gene, are inferred using an estimated multivariate mutual information-based Min-Redundancy Max-Relevance Criterion, and the inference is further refined by a swapping operation. Subsequently, we introduce a new method that combines Boolean network regulation modelling and the Pearson correlation coefficient to identify the interaction types (inhibition or activation) of the regulatory genes. This method is used to efficiently determine the optimal regulatory rule, consisting of AND, OR, and NOT operators, by defining the accurate application of the NOT operation in conjunction and disjunction Boolean functions. The proposed approach is evaluated using two real gene expression datasets, for an Escherichia coli gene regulatory network and a fission yeast cell cycle network. Although the Structural Accuracy is approximately the same as that of existing methods (MIBNI, REVEAL, Best-Fit, BIBN, and CST), the proposed method outperforms all of these methods with respect to efficiency and Dynamic Accuracy.
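The discretization step and the correlation-based sign test can be sketched in a few lines. This is a minimal illustration under stated assumptions: values strictly above the mean map to 1, the regulator at time t is compared with the target at time t+1, and the function names and toy series are hypothetical. The paper's actual procedure may differ in each of these details.

```python
from statistics import mean


def discretize(series):
    """Binarize one gene's expression series, using its mean as the threshold."""
    threshold = mean(series)
    return [1 if x > threshold else 0 for x in series]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def interaction_type(regulator, target):
    """Label a regulator as activating or inhibiting from the sign of a lagged correlation."""
    # Assumption: the regulator at time t acts on the target at time t + 1.
    r = pearson(regulator[:-1], target[1:])
    if r > 0:
        return "activation"
    if r < 0:
        return "inhibition"
    return "undetermined"


# Toy series in which the target moves opposite to the regulator one step later.
regulator = [0.1, 0.9, 0.2, 0.8, 0.3]
target = [0.5, 0.9, 0.1, 0.8, 0.2]
print(discretize(regulator))                # [0, 1, 0, 1, 0]
print(interaction_type(regulator, target))  # inhibition
```

A regulator labelled as inhibiting would then enter the Boolean rule under a NOT, while an activating one would enter unnegated.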
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562881 (published 2021-10-13)
Citations: 7
Genome-scale prediction of bacterial promoters
Miria Bernardino, R. Beiko
Proteins are responsible for many tasks including cell growth and metabolism. Transcription, the process where genes are used as templates for the production of a messenger RNA intermediate used in the synthesis of proteins, is regulated to ensure that the cell has the appropriate response according to its current needs. An essential step in transcription is the binding of a group of proteins, collectively known as RNA polymerase, to short promoter sequences upstream of the genes to be transcribed. Automated identification of promoters and nearby regulatory sequences can help to predict which genes are likely to be active under a given set of conditions. However, promoters are short, highly variable, and belong to subclasses that sometimes overlap, making their recognition a very difficult problem. Several tools have been developed to identify promoters in DNA, but methods are generally tested on small, balanced subsets of genomic sequence, and the results may not reflect their expected performance on genomes millions of DNA base pairs in length, where only ~1% of the sequence is expected to correspond to promoters. Here we introduce Expositor, a neural-network-based method that uses different types of DNA encodings and tunable sensitivity and specificity parameters. Although the performance of Expositor on balanced datasets was comparable to that of other approaches, at the genome scale our approach finds the highest number of promoters (70% against 46%) with the smallest number of false positives. We also examined the accuracy of Expositor in distinguishing different classes of promoters, and found that misclassification between classes was consistent with the biological similarity between promoters. The Expositor source code and pretrained model, along with the datasets used for training and testing, can be accessed at https://github.com/beiko-lab/Expositor.
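The gap between balanced-benchmark results and genome-scale results follows from simple arithmetic on class prevalence. The sketch below computes the expected precision of a classifier from its sensitivity, its specificity, and the fraction of true promoters. The 90% sensitivity and specificity figures are illustrative assumptions and are not values reported for Expositor or any other tool.

```python
def precision(sensitivity, specificity, prevalence):
    """Expected fraction of positive calls that are true positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical classifier, evaluated on a balanced set and at ~1% prevalence.
print(f"{precision(0.9, 0.9, 0.5):.3f}")   # 0.900
print(f"{precision(0.9, 0.9, 0.01):.3f}")  # 0.083
```

Under these assumed numbers, nine out of ten calls are correct on the balanced set, but fewer than one in ten is correct when only about 1% of windows are promoters, which is why false-positive counts matter so much at the genome scale.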
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562938 (published 2021-10-13)
Citations: 2
A Scaled-2D CNN for Skin Cancer Diagnosis
T. H. Rafi, R. Shubair
Every year, doctors diagnose skin cancer in around 3 million or more patients across the globe. It is currently one of the most widely recognized cancers affecting human health. Early diagnosis is therefore needed to prevent affected patients from reaching a critical condition. Skin cancer can be treated with topical drugs if it is diagnosed at an early stage. As a result, it is responsible for less than 1% of all cancer deaths. Skin tumors fall into two types: benign and malignant. Developing a robust early screening system to diagnose skin cancer requires an efficient prediction algorithm trained on a large dataset. The primary aim of this research is to develop an efficient skin cancer screening process using a robust deep neural network with a large dataset. In this paper, we aim to distinguish benign from malignant skin tumors using dermoscopic images from a publicly available dataset. We propose an efficient and fast scaled 2D-CNN based on the EfficientNet-B7 deep neural architecture with image preprocessing. We also use two pre-trained deep neural architectures, VGG19 and ResNet-50, to compare performance with the proposed architecture. The proposed architecture outperformed the other pre-trained CNN models, achieving higher AUC and accuracy.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562888 (published 2021-10-13)
Citations: 2
DeepGREP: A deep convolutional neural network for predicting gene-regulating effects of small molecules
Benan Bardak, Mehmet Tan
Accurately predicting desired gene expression effects in silico from representations of drugs and genes is a key task in chemogenomics. This paper proposes DeepGREP, a deep learning model that can predict the gene regulation effects of small molecules. The main motivation of this work is to improve the prediction of chemical-induced differential gene expression by using a convolution-based architecture to represent drugs and genes more effectively. To evaluate the performance of DeepGREP, we conducted several experiments and compared the results with DeepCop, the baseline model. The results show that DeepGREP outperforms the baseline model and significantly improves gene expression prediction, raising AUC by around 4%, F-Score by around 15%, and Enrichment Factor by around 22%. We also demonstrate that the proposed method mostly outperforms the baseline in the more difficult setting of generalizing to unseen molecules, evaluated with cold-drug splitting.
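Cold-drug splitting means that every drug in the test set is absent from the training set, so the model is scored only on unseen molecules. A minimal sketch follows, assuming the data is a list of `(drug, gene, label)` tuples; the function name, the tuple layout, and the toy records are hypothetical and are not taken from the DeepGREP code.

```python
import random


def cold_drug_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, gene, label) records so that no drug appears in both sets."""
    drugs = sorted({drug for drug, _gene, _label in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    # Hold out whole drugs rather than individual drug-gene pairs.
    n_test = max(1, round(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


# Toy records: five placeholder drugs, two placeholder genes each.
pairs = [(f"drug_{i}", gene, i % 2) for i in range(5) for gene in ("gene_a", "gene_b")]
train, test = cold_drug_split(pairs)
print(len(train), len(test))  # 8 2
```

A random split over pairs would instead let the same drug appear on both sides, which tends to make generalization look easier than it is.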
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562920 (published 2021-10-13)
Citations: 0
One Moose, Two Moose, Three Fields, More?
D. Ashlock, J. A. Brown, S. Houghten, M. Makhmutov
This study introduces a new game that models competition in foraging behavior. In each time period, two moose each decide which of three foraging areas to visit. Moose in the same foraging area fight, gaining no forage and also damaging some forage during their conflict. A moose alone in a foraging area eats, and the forage in each field is replenished according to a logistic growth model. This creates a relatively complex game with a rich strategy space in which the moose try to maximize their forage intake. The game is a coordination game, as the moose try to avoid conflict, which does not maximize forage intake. The paper reports the results of two student competitions at Innopolis University and performs agent evolution to verify the existence of a rich strategy space for the game.
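The rules of one time period can be written as a small simulation step. This is only a sketch: the abstract gives the structure of the game but none of its numbers, so the bite size, conflict damage, growth rate, and carrying capacity below are all assumed values, and the function names are hypothetical.

```python
def logistic_step(forage, rate=0.5, capacity=10.0):
    """One discrete step of logistic regrowth toward the carrying capacity."""
    return forage + rate * forage * (1.0 - forage / capacity)


def play_round(fields, choice_a, choice_b, bite=2.0, damage=1.0):
    """Resolve one time period and return (new_fields, gain_a, gain_b)."""
    fields = list(fields)
    gain_a = gain_b = 0.0
    if choice_a == choice_b:
        # Conflict: neither moose eats and some forage is destroyed.
        fields[choice_a] = max(0.0, fields[choice_a] - damage)
    else:
        # Each moose eats alone, limited by what its field holds.
        gain_a = min(bite, fields[choice_a])
        fields[choice_a] -= gain_a
        gain_b = min(bite, fields[choice_b])
        fields[choice_b] -= gain_b
    # Every field regrows after the moose have acted.
    fields = [logistic_step(f) for f in fields]
    return fields, gain_a, gain_b


start = [5.0, 5.0, 5.0]
clash_fields, clash_a, clash_b = play_round(start, 0, 0)
apart_fields, apart_a, apart_b = play_round(start, 0, 1)
print(clash_a + clash_b, apart_a + apart_b)  # 0.0 4.0
```

Even in this stripped-down form, coordination pays: choosing different fields yields forage for both players, while a clash yields none and leaves the contested field poorer.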
本研究引入了一种新的游戏来模拟觅食行为中的竞争。两只驼鹿在每个时间段内决定三个觅食区域中的哪一个。在同一觅食区域的驼鹿会打架,在他们的冲突中得不到饲料,也会破坏一些饲料。驼鹿独自在觅食区域进食,每个区域的饲料都用物流增长模型进行补充。这创造了一个相对复杂的游戏,其中有丰富的策略空间,驼鹿试图最大化他们的饲料摄入量。这个游戏是一个协调的游戏,因为驼鹿试图避免冲突,这不会使饲料摄入量最大化。本文报告了Innopolis大学两个学生竞赛的结果,并通过智能体进化来验证博弈存在丰富的策略空间。
{"title":"One Moose, Two Moose, Three Fields, More?","authors":"D. Ashlock, J. A. Brown, S. Houghten, M. Makhmutov","doi":"10.1109/CIBCB49929.2021.9562871","DOIUrl":"https://doi.org/10.1109/CIBCB49929.2021.9562871","url":null,"abstract":"This study introduces a new game that models competition in foraging behavior. Two moose decide, in each time period, which of three foraging areas to visit. Moose in the same foraging area fight, gaining no forage and also damaging some forage during their conflict. Moose alone in a foraging area eat, with the forage in each field being replenished with a logistic growth model. This creates a relatively complex game with a rich strategy space in which the moose try to maximize their forage intake. The game is a coordination game, as the moose try to avoid conflict which does not maximize forage intake. The paper reports the results of two student competitions at Innopolis University and performs agent evolution to verify the existence of a rich strategy space for the game.","PeriodicalId":163387,"journal":{"name":"2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134196005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
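The rules of the foraging game in the abstract above (two moose, three fields, no intake and some damage on conflict, logistic regrowth) can be sketched in a few lines of Python. This is only an illustrative sketch: the abstract gives no numeric parameters, so the growth rate, carrying capacity, bite size, damage amount, starting forage, and the three example strategies are all assumptions made here, not values from the paper.

```python
import random

# All numeric values below are assumptions for illustration only.
CAPACITY = 100.0  # assumed carrying capacity of each field
RATE = 0.2        # assumed logistic growth rate per time period
BITE = 10.0       # assumed forage a lone moose can eat per period
DAMAGE = 5.0      # assumed forage destroyed when two moose fight


def regrow(forage):
    """One step of discrete logistic growth toward CAPACITY."""
    return forage + RATE * forage * (1.0 - forage / CAPACITY)


def play(strategy_a, strategy_b, rounds=50, seed=0):
    """Play the game and return (intake per moose, final field levels)."""
    rng = random.Random(seed)
    fields = [50.0, 50.0, 50.0]  # assumed starting forage in each field
    intake = [0.0, 0.0]
    for _ in range(rounds):
        a = strategy_a(fields, rng)
        b = strategy_b(fields, rng)
        if a == b:
            # Conflict: neither moose eats and some forage is destroyed.
            fields[a] = max(0.0, fields[a] - DAMAGE)
        else:
            # Each moose is alone in its field and eats what is available.
            for moose, choice in enumerate((a, b)):
                eaten = min(BITE, fields[choice])
                fields[choice] -= eaten
                intake[moose] += eaten
        fields = [regrow(f) for f in fields]
    return intake, fields


# Hypothetical example strategies: each maps (fields, rng) to a field index.
def greedy(fields, rng):
    """Always go to the field with the most forage."""
    return max(range(len(fields)), key=fields.__getitem__)


def fixed(index):
    """Always go to the same field."""
    return lambda fields, rng: index


def uniform(fields, rng):
    """Pick a field uniformly at random."""
    return rng.randrange(len(fields))
```

Under these assumed parameters, two greedy moose choose the same field in every period and therefore eat nothing, while two moose that each stay in a different field obtain equal, positive intake. This illustrates why the game rewards coordination.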
Discovering Missing Edges in Drug-Protein Networks: Repurposing Drugs for SARS-CoV-2
Fatemeh Zaremehrjardi, Athar Omidi, Cristina D. Sciortino, Ryan E. R. Reid, Ryan Lukeman, J. Hughes, O. Soufan
The COVID-19 pandemic, caused by the SARS-CoV-2 virus, led to a global health crisis, with more than 157 million confirmed cases by May 2021. Effective medication is desperately needed. Predicting drug-target interactions (DTIs) is an important step in discovering novel uses of chemical structures. Here, we develop a pipeline to predict novel DTIs based on the proteins of the coronavirus. Several datasets (human/SARS-CoV-2 protein-protein interactions (PPI), drug-drug similarity (DD sim), and DTIs) are used and combined. After all datasets are mapped onto a heterogeneous graph, path-related features are extracted. We then apply various machine learning (ML) algorithms to model our dataset and predict novel DTIs among unlabeled pairs. Candidate drugs that the models identify with high frequency are reported. In addition, evidence for the efficacy against COVID-19 of the drugs predicted by the models is presented. The proposed model can be generalized to include other features that provide context for predicting drugs for different diseases.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562855 | Published: 2021-10-13
Citations: 0
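The abstract above does not say which path-related features are extracted from the heterogeneous graph. One common choice, assumed here purely for illustration, is to count the paths between a drug and a protein that follow a fixed sequence of edge types. The sketch below uses hypothetical edge-type names (`dd_sim`, `dti`, `ppi`), hypothetical path templates, and made-up toy nodes; none of them come from the paper.

```python
from collections import defaultdict

# Hypothetical path templates: sequences of edge types from a drug to a protein.
METAPATHS = [
    ("dd_sim", "dti"),  # drug -> similar drug -> its known target
    ("dti", "ppi"),     # drug -> known target -> interacting protein
]


def build_adjacency(edges):
    """Index undirected typed edges as adj[edge_type][node] -> set of neighbours."""
    adj = defaultdict(lambda: defaultdict(set))
    for u, v, edge_type in edges:
        adj[edge_type][u].add(v)
        adj[edge_type][v].add(u)
    return adj


def count_metapath(adj, start, end, edge_types):
    """Count walks from start to end whose edges follow edge_types in order."""
    frontier = {start: 1}
    for edge_type in edge_types:
        reached = defaultdict(int)
        for node, n_walks in frontier.items():
            for neighbour in adj[edge_type][node]:
                reached[neighbour] += n_walks
        frontier = reached
    return frontier.get(end, 0)


def path_features(adj, drug, protein):
    """Feature vector for one (drug, protein) pair: one count per template."""
    return [count_metapath(adj, drug, protein, mp) for mp in METAPATHS]


# Made-up toy graph, for illustration only.
toy_edges = [
    ("drug_1", "drug_2", "dd_sim"),
    ("drug_1", "prot_h1", "dti"),
    ("drug_2", "prot_h2", "dti"),
    ("prot_h1", "prot_h2", "ppi"),
    ("prot_h2", "prot_v1", "ppi"),
]
toy_adj = build_adjacency(toy_edges)
```

On this toy graph, `path_features(toy_adj, "drug_1", "prot_h2")` counts one path of each template. Vectors of this kind, computed for labeled and unlabeled drug-protein pairs, could then be passed to any standard classifier.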
Using Clinical Drug Representations for Improving Mortality and Length of Stay Predictions
Batuhan Bardak, Mehmet Tan
Drug representations have played an important role in cheminformatics. In the healthcare domain, however, they have been underused relative to the rest of Electronic Health Record (EHR) data, owing to the complexity of high-dimensional drug representations and the lack of a proper pipeline for converting clinical drugs to their representations. Time-varying vital signs, laboratory measurements, and related time-series signals are commonly used to predict clinical outcomes. In this work, we demonstrate that using clinical drug representations in addition to other clinical features has significant potential to increase the performance of mortality and length of stay (LOS) models. We evaluate two different drug representation methods, the Extended-Connectivity Fingerprint (ECFP) and the SMILES-Transformer embedding, on clinical outcome predictions. The results show that the proposed multimodal approach achieves a substantial improvement on clinical tasks over baseline models. Using clinical drug representations as additional features improves LOS prediction by around 6% in Area Under the Receiver Operating Characteristics (AUROC) and by around 5% in Area Under the Precision-Recall Curve (AUPRC). Furthermore, for the mortality prediction task, there is an improvement over the time-series baseline of around 2% in AUROC and 3.5% in AUPRC. The code for the proposed method is available at https://github.com/tanlab/MIMIC-III-Clinical-Drug-Representations.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562819 | Published: 2021-10-13
Citations: 0
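The improvements in the abstract above are stated in terms of AUROC and AUPRC. For readers unfamiliar with the two metrics, the sketch below shows one standard way to compute them for binary labels in plain Python. It is not taken from the authors' repository; the function names are chosen here, and average precision is used as a common estimate of AUPRC, which is an assumption about how the area is approximated.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example.

    A common estimate of AUPRC; tied scores are kept in input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Made-up example: four patients, two with the positive outcome.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
```

For the made-up example, three of the four positive-negative pairs are ordered correctly, so `auroc` returns 0.75, and `average_precision` returns (1/1 + 2/3) / 2, about 0.833.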
Journal
2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)