
CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics: Latest Papers

Proof-of-concept model of red blood cell with coarse-grained hemoglobin
Mariana Ondrusová, I. Cimrák
In modelling individual red blood cells, several biomechanical phenomena have to be taken into account. Besides the evident mechanical properties of the red blood cell membrane, such as shear elasticity, the viscosity ratio between the inner and outer fluid also plays a role, yet it is sometimes omitted from models. We present an approach for including the difference between the inner and the outer fluid of the cell in the model. We analyze the physical properties of the protein hemoglobin, which is responsible for the higher viscosity of the cell's inner cytoplasm. To keep the computational complexity reasonable, we build a coarse-grained model of hemoglobin. We present an initial proof-of-concept study using a validation test of the cell's behaviour in a shear flow.
DOI: https://doi.org/10.1145/3429210.3429228 (published 2020-11-19)
Citations: 0
Automatic ICD-10 codes association to diagnosis: Bulgarian case
Boris Velichkov, Simeon Gerginov, P. Panayotov, S. Vassileva, Gerasim Velchev, I. Koychev, S. Boytcheva
This paper presents an approach for automatically associating diagnoses written in Bulgarian with ICD-10 codes. Since this task is currently performed manually by medical professionals, automating it would save time and allow doctors to focus more on patient care. The presented approach employs a fine-tuned language model (BERT) as a multi-class classifier. As there are several different types of BERT models, we conduct experiments to assess the applicability of domain-specific and language-specific model adaptation. To train our models, we use a large corpus of about 350,000 textual descriptions of diagnoses in Bulgarian annotated with ICD-10 codes. We compare the accuracy of ICD-10 code prediction across different types of BERT language models. The results show that the MultilingualBERT model (top-1 accuracy 81%, macro F1 86%, top-5 MRR 88%) outperforms the other models. However, all models seem to suffer from the class imbalance in the training dataset. The achieved prediction accuracy can be regarded as very high, given the large number of classes and the noisiness of the data. The results also provide evidence that the collected dataset and the proposed approach can be useful in building an application to help medical practitioners with this task, and they encourage further research to improve the prediction accuracy of the models. By design, the proposed approach strives to be as language-independent as possible and can be easily adapted to other languages.
DOI: https://doi.org/10.1145/3429210.3429224 (published 2020-11-19)
Citations: 5
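The ICD-10 abstract above reports top-1 accuracy and top-5 MRR. As a minimal sketch of how such ranking metrics are usually computed (the function names, the toy predictions, and the example codes are assumptions for illustration, not taken from the paper), each diagnosis gets a ranked list of candidate codes, and the metrics compare that list with the annotated code:

```python
from typing import Sequence


def top1_accuracy(ranked: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Fraction of items whose highest-ranked code equals the gold code."""
    hits = sum(1 for preds, g in zip(ranked, gold) if preds and preds[0] == g)
    return hits / len(gold)


def mrr_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 5) -> float:
    """Mean reciprocal rank, looking only at the first k predictions per item."""
    total = 0.0
    for preds, g in zip(ranked, gold):
        for rank, code in enumerate(preds[:k], start=1):
            if code == g:
                total += 1.0 / rank  # reciprocal of the first matching position
                break
    return total / len(gold)


# Hypothetical ranked predictions for three diagnoses (illustrative codes only).
ranked = [["I10", "I11", "E11"], ["J45", "J44", "J20"], ["E11", "E10", "I10"]]
gold = ["I10", "J44", "K21"]

print(top1_accuracy(ranked, gold))  # 1 of 3 correct at rank 1
print(mrr_at_k(ranked, gold))       # (1 + 1/2 + 0) / 3
```

MRR rewards a correct code that appears a little below the top of the list, which is why it can be higher than top-1 accuracy on the same predictions.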
Classification of Protein Crystallization Images using EfficientNet with Data Augmentation
David William Edwards II, I. Dinç
In this paper, we applied EfficientNet, a scalable deep convolutional neural network, with a custom data augmentation stage to a public protein crystallization image dataset called MARCO. The MARCO dataset has 493,214 protein crystallization images collected from several well-known institutions. In our experiments, EfficientNet outperformed the accuracies reported in previous studies, reaching an overall 96.71% testing and 91.33% validation accuracy on the dataset. EfficientNet also achieved 97.23% crystal detection accuracy on the testing data, which is a significant improvement over existing studies.
DOI: https://doi.org/10.1145/3429210.3429220 (published 2020-11-19)
Citations: 1
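The abstract above mentions a custom data augmentation stage but does not describe it. Purely as an illustration of what geometric augmentation means, the sketch below applies flips and a 90-degree rotation to a tiny image stored as nested lists. The choice of transforms and all function names are assumptions; the paper's actual augmentation pipeline may differ.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90_cw(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image) -> List[Image]:
    """Return the original image plus three transformed copies."""
    return [img, hflip(img), vflip(img), rot90_cw(img)]


sample = [[1, 2], [3, 4]]
for variant in augment(sample):
    print(variant)
```

Label-preserving transforms like these suit crystallization images in principle, because the class of a droplet image does not depend on its orientation.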
Modeling metabolic fluxes underlying cassava storage root growth through E-Fmin analysis
Ratchaprapa Kamsen, S. Kalapanulak, T. Saithong
Cassava (Manihot esculenta Crantz) is a staple crop that has a great impact on global food security. Cassava yield improvement has been researched continuously, resulting in various elite cultivars bred over the last decades. Pursuing a better yield requires deep insight into the metabolic processes underlying the assimilation and conversion of carbon substrates to storage root biomass. In this study, we employed E-Fmin analysis to model carbon metabolism in storage roots of cassava. The model was constructed based on the primary metabolism of the carbon assimilation pathway in non-photosynthetic cells and the corresponding gene expression data. The model, named rMeCBMx-EFmin, was able to mimic the growth of storage roots measured from Kasetsart 50 (KU50). rMeCBMx-EFmin highlighted a tentative metabolic flux distribution in which carbon substrates were economically converted into cellular biomass of cassava storage roots. The smaller total flux (3.2749 mmol gDWSRs⁻¹ day⁻¹) compared with the published model of cassava storage roots (4.4255 mmol gDWSRs⁻¹ day⁻¹) indicated metabolic frugality in the simulated root metabolism. The simulation also showed that alpha-D-glucose-6-phosphate (α-D-Glc-6P) partitioned from respiration was a key carbon precursor imported to the plastid for storage root biomass production. The knowledge gained would be beneficial for the later experimental design of yield enhancement.
DOI: https://doi.org/10.1145/3429210.3429234 (published 2020-11-19)
Citations: 0
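The two total-flux figures quoted in the cassava abstract can be put into proportion with a short calculation. This is only arithmetic on the reported numbers; the variable and function names are illustrative.

```python
def relative_reduction(new: float, reference: float) -> float:
    """Fraction by which `new` is smaller than `reference`."""
    return (reference - new) / reference


# Total fluxes quoted in the abstract, in mmol per gram dry weight of storage roots per day.
efmin_total_flux = 3.2749      # rMeCBMx-EFmin
published_total_flux = 4.4255  # previously published cassava storage root model

reduction = relative_reduction(efmin_total_flux, published_total_flux)
print(f"Total flux is lower by {reduction:.1%}")
```

The E-Fmin model therefore carries roughly a quarter less total flux than the earlier model, which is the quantitative basis for the "metabolic frugality" remark.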
Convolutional neural network for prediction of COVID-19 from chest X-ray images
Debayan Goswami, Anwesha Law, Debasrita Chakraborty, Abhishek Dey
The COVID-19 pandemic has affected people worldwide, and we are in dire need of techniques to bring this situation under control. Among the various approaches attempted by researchers, preliminary prediction of COVID-19 from chest X-ray images is proving to be quite beneficial and is therefore being explored thoroughly. In this paper, we propose a novel combination of local-binary-pattern-based feature selection and a convolutional neural network that predicts positive and negative cases by analysing chest X-ray images. The model consists of a feature extraction process followed by pooling and convolution layers systematically placed to give an optimal output. The proposed model has been trained and tested on a COVID-19 CXR image dataset, and it achieves a significant improvement over the five comparison methods.
DOI: https://doi.org/10.1145/3429210.3429219 (published 2020-11-19)
Citations: 2
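The abstract above relies on local binary patterns (LBP) as features. A minimal sketch of the basic 3x3 LBP operator follows, assuming the common convention that a neighbour contributes a 1 bit when it is at least as bright as the centre pixel. The neighbour ordering and function names are choices made here for illustration; the paper may use a different LBP variant.

```python
from typing import List

Image = List[List[int]]

# Neighbour offsets visited clockwise, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """8-bit LBP code of the pixel at (r, c); the pixel must not lie on the border."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit  # set the bit for neighbours at least as bright as the centre
    return code


def lbp_image(img: Image) -> Image:
    """LBP codes for all interior pixels (the one-pixel border is dropped)."""
    rows, cols = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, cols - 1)] for r in range(1, rows - 1)]


patch = [
    [9, 1, 1],
    [1, 5, 1],
    [1, 1, 1],
]
print(lbp_image(patch))  # a single code for the single interior pixel
```

Because each code depends only on brightness comparisons, it is unchanged by any monotonic shift in image intensity, which is one reason LBP is popular for texture description in radiographs taken under varying exposure.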
An Image Segment-based Classification for Chest X-Ray Image
Phongsathorn Kittiworapanya, Kitsuchart Pasupa
In late 2019, the first case of COVID-19 was confirmed in Wuhan, China. The number of cases has been growing rapidly since then. Molecular and antigen testing methods are very accurate for the diagnosis of COVID-19. However, with sudden increases in infected cases, laboratory-based molecular tests and COVID-19 test kits are in short supply. Because the virus affects an infected patient's lungs, interpreting images obtained from Computed Tomography scanners and Chest X-ray Radiography (CXR) machines can be an alternative for diagnosis. However, CXR interpretation requires experts, and the number of experts is limited. Therefore, automatic detection of COVID-19 from CXR images is required. We describe a system for automatic detection of COVID-19 from CXR images. It first segmented images to select only the lungs. The segmented part was then fed into a multiclass classification module, which worked well with samples obtained from various sources with different aspect ratios, contrast and viewpoints. The system also handled the unbalanced dataset, in which only a small fraction of images showed COVID-19. Our system achieved an F1-score of 92% and a macro F1-score of 88.1% on the public leaderboard of the 3rd Deep Learning and AI Summer/Winter School Hackathon Phase 3: Multi-class COVID-19 Chest X-ray challenge.
DOI: https://doi.org/10.1145/3429210.3429227 (published 2020-11-19)
Citations: 2
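The abstract above reports a macro F1-score on an unbalanced multi-class dataset. The sketch below shows the standard definition of per-class and macro-averaged F1 on a toy three-class example; the class names and predictions are invented for illustration and are not the challenge data.

```python
from typing import Dict, Sequence


def f1_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for every class."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy unbalanced data: four normal, three pneumonia, one COVID-19 image.
y_true = ["normal"] * 4 + ["pneumonia"] * 3 + ["covid"]
y_pred = ["normal", "normal", "normal", "pneumonia",
          "pneumonia", "pneumonia", "normal", "covid"]

print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro averaging gives the rare class the same weight as the frequent ones, so it is a stricter summary than a sample-weighted score when only a small fraction of images show COVID-19.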
Predicting Dihydroartemisinin Resistance in Plasmodium falciparum using Pathway Activity Inference
Nicola Lawford, Jonathan H. Chan
Drug resistance threatens the effectiveness of treatments for infectious diseases, particularly on the global scale, where mutation is rapid, mechanisms of resistance are developing or unknown, and limited data are available. Pathway activity inference is a dimensionality reduction method with proven effectiveness in classifying cancer types and drug responses based on transcription data. We propose a novel application of pathway activity inference to predict dihydroartemisinin resistance in the Plasmodium falciparum strain of malaria, a global infectious disease. Optimized pathway activity inference models outperform untransformed gene expression models in both in vitro regression (p = 0.03) and in vivo classification tasks (p = 2 × 10⁻⁹). Optimal methods were found to be mostly ensemble (5 of 12) and/or kernel-based (7 of 12), providing the first evidence of the effectiveness of kernel methods for predicting drug resistance in infectious diseases. Performance metrics of the optimal in vitro model on in vivo data (accuracy , area under the receiver operating characteristic curve = 0.63) affirmed the low empirical correlation between resistance measures in the two settings.
DOI: https://doi.org/10.1145/3429210.3429215 (published 2020-11-19)
Citations: 0
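The abstract above describes pathway activity inference as a dimensionality reduction from gene-level to pathway-level features. One simple variant, shown here only as an assumed illustration (the abstract does not say which inference method the authors optimized), averages the z-scored expression of each pathway's member genes per sample. Gene and pathway names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def zscore(values: List[float]) -> List[float]:
    """Standardize one gene's expression across samples."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def pathway_activity(
    expression: Dict[str, List[float]],  # gene -> expression value per sample
    pathways: Dict[str, List[str]],      # pathway -> member genes
) -> Dict[str, List[float]]:
    """Per-sample activity of each pathway as the mean z-score of its measured genes."""
    z = {gene: zscore(vals) for gene, vals in expression.items()}
    n_samples = len(next(iter(expression.values())))
    activity = {}
    for name, genes in pathways.items():
        members = [z[g] for g in genes if g in z]
        if not members:
            continue  # skip pathways with no measured genes
        activity[name] = [mean(m[s] for m in members) for s in range(n_samples)]
    return activity


expression = {"g1": [1.0, 2.0, 3.0], "g2": [2.0, 4.0, 6.0], "g3": [5.0, 5.0, 5.0]}
pathways = {"p1": ["g1", "g2"], "p2": ["g3"]}
print(pathway_activity(expression, pathways))
```

Here three gene features collapse into two pathway features, and a downstream regressor or classifier would be trained on the pathway activities instead of the raw expression values.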
Identification of Gene Subnetwork Biomarkers of Lung Cancer from RNA-seq Data
Kritsada Sreebunpeng, Jonathan H. Chan, A. Meechai
In recent years, the increasing availability of cancer RNA-seq datasets has provided unprecedented information and opportunities for the discovery of cancer biomarkers. In this study, we tested our previously published Gene Sub-Network-based Feature Selection (GSNFS) method for identifying gene subnetwork biomarkers on RNA-seq-based gene expression data of lung cancer. In addition, five different filter-based feature selection techniques were explored to rank the identified subnetworks. We found that the majority of the top 10 ranked subnetworks were associated with cancer pathways such as the MAPK signalling pathway. Using a Support Vector Machine (SVM) classifier, evaluated by the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve with 10-fold cross-validation and cross-dataset validation, we showed that gene subnetwork biomarkers obtained by RNA-seq-based GSNFS analysis had excellent classification performance. Additionally, by comparing the top-ranked subnetworks obtained from RNA-seq-based GSNFS analysis with those previously obtained from DNA microarray-based GSNFS analysis, we could categorize subnetworks and find cancer pathways unique to each data type.
DOI: https://doi.org/10.1145/3429210.3429212 (published 2020-11-19)
Citations: 0
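The abstract above evaluates classifiers by the area under the ROC curve. As a minimal sketch, AUC can be computed without plotting anything, using its rank interpretation: it is the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores below are invented for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for two tumour (1) and two normal (0) samples.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

In a 10-fold cross-validation, this value would be computed on each held-out fold and then averaged; the pairwise loop is quadratic, which is fine for a sketch but not for large datasets.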
Towards Automated Comprehension and Alignment of Cardiac Models at the System Invariant Level
Samuel Huang, Madeline Diep, Kuk Jin Jang, E. Cherry, F. Fenton, R. Cleaveland, Mikael Lindvall, R. Mangharam, Adam Porter
The study of cardiac arrhythmias has spurred the development of models across a variety of formulations and scales, designed for different purposes and each with a distinct configuration space. Nevertheless, these models should be able to exhibit equivalent behavior when their contexts overlap. Configuring models to both support this context equivalence and still exhibit intended behavioral characteristics can be challenging. Due to the complexity of this problem, automation is desirable. We present a framework aimed at automating the comprehension and alignment of cardiac model behaviors. For model comprehension, we mine a set of properties (invariants) that a model with a given configuration will exhibit when executed. Comprehension can be extended to model alignment: we perform comprehension of one model, and then mine a set of configurations for a second, each of which produces invariants aligned to the invariants of the first. The configuration spaces of the two models under study need not be related in any way; rather, the systems are compared by means of the system invariants that they each exhibit. We model system invariants as association rules, a well-studied representation used in the field of data mining. We apply our methodology to two one-dimensional models of cardiac tissue. One model is the well-known differential-equations-based Fenton-Karma model representing the electrophysiology of interconnected cardiac cells, while the other is a timed automaton representation of cardiac tissue designed to enable formal analysis. We demonstrate alignment of the models with respect to activation rates and path conductance. We expect that this methodology can be generalized beyond cardiac models.
心律失常的研究促进了各种模型的发展,这些模型具有不同的配方和尺度,并为不同的目的而设计,每个模型都具有不同的配置空间。然而,当它们的上下文重叠时,这些模型应该能够表现出相同的行为。将模型配置为既支持这种上下文等价性,又显示预期的行为特征,可能是一项挑战。由于这个问题的复杂性,自动化是可取的。我们提出了一个旨在自动化理解和对齐心脏模型行为的框架。为了理解模型,我们挖掘具有给定配置的模型在执行时将显示的一组属性(不变量)。理解可以扩展到模型对齐:我们执行一个模型的理解,然后为第二个模型挖掘一组配置,其中每个模型都产生与第一个模型的不变量对齐的不变量。所研究的两个模型的构型空间不需要以任何方式相关联;相反,系统是通过它们各自表现出的系统不变量来进行比较的。我们将系统不变量建模为关联规则,这是一种在数据挖掘领域中得到充分研究的表示。我们将我们的方法应用于两个一维心脏组织模型。一个模型是著名的基于微分方程的Fenton-Karma模型,代表相互连接的心脏细胞的电生理,而另一个模型是心脏组织的时间自动机表示,旨在实现形式分析。我们证明了模型在激活率和路径电导方面的一致性。我们希望这种方法可以推广到心脏模型之外。
{"title":"Towards Automated Comprehension and Alignment of Cardiac Models at the System Invariant Level","authors":"Samuel Huang, Madeline Diep, Kuk Jin Jang, E. Cherry, F. Fenton, R. Cleaveland, Mikael Lindvall, R. Mangharam, Adam Porter","doi":"10.1145/3429210.3429225","DOIUrl":"https://doi.org/10.1145/3429210.3429225","url":null,"abstract":"The study of cardiac arrhythmias has spurred the development of models across a variety of formulations and scales and designed for different purposes, each with distinct configuration spaces. Nevertheless, these models should be able to exhibit equivalent behavior when their contexts overlap. Configuring models to both support this context equivalence and still exhibit intended behavioral characteristics can be challenging. Due to the complexity of this problem, automation can be desirable. We present a framework aimed at automating the comprehension and alignment of cardiac model behaviors. For model comprehension, we mine a set of properties (invariants) that a model with given configuration will exhibit when executed. Comprehension can be extended to model alignment: we perform comprehension of one model, and then mine a set of configurations for a second, each of which produces invariants aligned to the invariants of the first. The configuration spaces of the two models under study need not be related in any way; rather, the systems are compared by means of the system invariants that they each exhibit. We model system invariants as association rules, a well-studied representation used in the field of data mining. We apply our methodology to two one-dimensional models of cardiac tissue. One model is the well-known differential-equations-based Fenton-Karma model representing the electrophysiology of interconnected cardiac cells, while the other is a timed automaton representation of cardiac tissue designed to enable formal analysis. We demonstrate alignment of the models with respect to activation rates and path conductance. We expect this methodology can be generalized beyond cardiac models.","PeriodicalId":164790,"journal":{"name":"CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122508357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics
{"title":"CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics","authors":"","doi":"10.1145/3429210","DOIUrl":"https://doi.org/10.1145/3429210","url":null,"abstract":"","PeriodicalId":164790,"journal":{"name":"CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114902028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1