
2021 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB): Latest Papers

An Improved Method for Finding Attractors of Large-Scale Asynchronous Boolean Networks
Giang V. Trinh, K. Hiraishi
Attractor detection in Asynchronous Boolean Networks (ABNs) is very challenging because the state transition graph of an ABN is highly complex. Recently, an efficient method (called FVS-ARBN) was proposed for finding the attractors of an ABN exactly. FVS-ARBN uses a Feedback Vertex Set (FVS) to obtain a candidate set of states, then filters this set by checking reachability in the ABN. This method gives promising results; however, it still needs to be improved to handle larger networks. In this paper, we propose a new method (named iFVS-ABN) that includes two improvements to FVS-ARBN. First, we propose a combination of multiple existing techniques to check reachability in ABNs efficiently. Second, we formally state and prove a relation between a Negative Feedback Vertex Set (NFVS) and the dynamics of an ABN. Based on this relation, we propose using an NFVS instead of an FVS to obtain the candidate set of states. Experimental results show that the two improvements are effective and that the improved method outperforms the original one.
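To make the notion of an attractor concrete, the following is a minimal brute-force sketch, not the FVS-ARBN or iFVS-ABN method from the abstract. It assumes the usual asynchronous semantics (exactly one node updates per step) and treats an attractor as a set of states that can all reach each other and cannot be left. The three-node network `toy` and all function names are hypothetical, and the enumeration only scales to very small networks.

```python
from itertools import product


def async_successors(state, functions):
    """States reachable in one asynchronous step (exactly one node changes)."""
    successors = []
    for i, f in enumerate(functions):
        new_value = f(state)
        if new_value != state[i]:
            successors.append(state[:i] + (new_value,) + state[i + 1:])
    return successors


def reachable(start, functions):
    """All states reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in async_successors(current, functions):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)


def attractors(functions):
    """Brute-force attractor search over all 2^n states (small n only)."""
    n = len(functions)
    reach = {s: reachable(s, functions) for s in product((0, 1), repeat=n)}
    found = set()
    for state, reach_set in reach.items():
        # A state lies in an attractor iff everything it reaches has the
        # same reachable set, i.e. can reach it back and nothing more.
        if all(reach[t] == reach_set for t in reach_set):
            found.add(reach_set)
    return found


# Hypothetical toy network: x0 copies x1, x1 copies x0, x2 inhibits itself.
toy = [lambda s: s[1], lambda s: s[0], lambda s: 1 - s[2]]
result = attractors(toy)
for attractor in sorted(sorted(a) for a in result):
    print(attractor)
```

In this toy network the self-inhibiting node keeps flipping, so both attractors are cycles of two states rather than fixed points. The exponential cost of this enumeration is the reason methods such as those in the abstract restrict attention to a candidate set of states.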
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562947 (published 2021-10-13)
Citations: 4
Identification of Genes Associated with Alzheimer's Disease using Evolutionary Computation
Guangyao Chen, James Sargant, S. Houghten, T. K. Collins
A multi-objective genetic algorithm is applied to the problem of identifying genes associated with Alzheimer's disease. The input to the genetic algorithm is a set of centrality measures obtained by merging various biological evidence types into a complex network, based on a set of 11 genes already known to be associated with this disease. In terms of leave-one-out validation, the strongest results are obtained using betweenness, with ranking showing that better results are sometimes obtained by including either stress or load with betweenness. The overall ranking of the genes across all runs is examined and suggests some genes worthy of further study with respect to their link to this disease. The methodology is also evaluated with respect to robustness by modifying the original network by a range of percentages, and applying the methodology to these variations. The results show that the methodology returns very similar results under these circumstances.
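Betweenness, the centrality measure the abstract singles out, counts how many shortest paths between other node pairs pass through a node. The sketch below computes it with Brandes' algorithm for an unweighted, undirected graph. It is an illustration only: the adjacency list `network` and its node names are hypothetical placeholders, not genes or data from the paper, and the stress and load measures are not shown.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected graph.

    `adj` maps each node to the list of its neighbours.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from source
        dist = {v: -1 for v in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical path-shaped network: A - B - C - D.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = betweenness(network)
print(scores)
```

A ranking by such scores could then serve as one input to a search procedure like the genetic algorithm described above; how the paper combines the measures is not specified in the abstract.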
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562876 (published 2021-10-13)
Citations: 0
Random Forest to Predict Eucalyptus as a Potential Herb in Preventing Covid19
N. Ramadhanti, W. Kusuma, I. Batubara, R. Heryanto
The COVID-19 pandemic has been on the rise since the beginning of 2020. In Indonesia, the first case was identified on 3 March 2020, and cases peaked around the end of January 2021. Although the recent number of COVID-19 cases is lower than at the peak, positive cases have increased from around 2,600 to 6,300 per day over the last month. This trend is urging people to take better care of their health. One of the alternatives Indonesians use to maintain and improve their health is herbal medicine. Indonesia is one of the countries with a rich variety of herbal species, and eucalyptus is a herbal plant with many benefits. Even before the pandemic, eucalyptus oil was used daily by many people in Indonesia. In this study, we predict which compounds in eucalyptus interact with proteins of the SARS-CoV-2 virus using a machine learning method, namely Random Forest. This is an application of drug repurposing, a drug-discovery approach that uses existing drug-target interaction data as a model to predict compounds whose interactions with targets have not yet been identified. Applying this method, we predicted that some compounds found in eucalyptus, such as alpha-terpinene and 1,8-cineole, might interact with COVID-19 proteins; thus, eucalyptus can be used as a preventive measure.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562940 (published 2021-10-13)
Citations: 1
Vaccinating a Population is a Changing Programming Problem
Sumaiya Amin, S. Houghten, J. Hughes
How best to apply vaccines to a population is an open problem. It is trivial to derive intuitive strategies, but until tested, their efficacy is not known. This problem is particularly challenging when considering the dynamics of social contact networks and their changes over time. A system for automatically discovering tested vaccination strategies with evolutionary computation has been improved to include additional graph metrics and to generate vaccination strategies for dynamic graphs, which is what real social networks within communities are expected to be. The system's ability to generate effective strategies was demonstrated, along with a comparison of the strategies developed when fit to a static graph versus a dynamic graph. It was observed that the additional computational resources required to generate strategies on a dynamic graph may not be necessary, as strategies developed for static graphs performed similarly well; however, the authors are careful to acknowledge that results may differ significantly when adjusting the system's many parameters.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562943 (published 2021-10-13)
Citations: 3
A comparison of multi-objective optimization algorithms to identify drug target combinations
S. Spolaor, D. Papetti, P. Cazzaniga, D. Besozzi, M. S. Nobile
Combination therapies represent one of the most effective strategies for inducing cancer cell death and reducing the risk of developing drug resistance. The identification of putative novel drug combinations, which typically requires expensive and time-consuming lab experiments, can be supported by the synergistic use of mathematical models and multi-objective optimization algorithms. The computational approach makes it possible to automatically search for potential therapeutic combinations and to test their effectiveness in silico, thus reducing the costs in time and money and driving the experiments toward the most promising therapies. In this work, we couple dynamic fuzzy modeling of cancer cells with different multi-objective optimization algorithms, and we compare their performance in identifying drug target combinations. Specifically, we perform batches of optimizations with 3 and 4 objective functions defined to achieve a desired behavior of the system (e.g., maximize apoptosis while minimizing necrosis and survival), and we compare the quality of the solutions included in the Pareto fronts. Our results show that both the choice of the multi-objective algorithm and the formulation of the optimization problem have an impact on the identified solutions, highlighting the strengths as well as the limitations of this approach.
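The Pareto fronts mentioned above contain the solutions that no other solution beats on every objective at once. The sketch below filters a list of candidates down to that non-dominated set. It assumes all objectives are minimized, so "maximize apoptosis" is encoded as minimizing its negative; the candidate values are invented for illustration and do not come from the paper.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better
    somewhere, with every objective treated as a minimization target."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates as (-apoptosis, necrosis, survival) tuples.
candidates = [
    (-0.9, 0.2, 0.1),
    (-0.8, 0.1, 0.2),
    (-0.7, 0.3, 0.3),   # worse than the first candidate on every objective
    (-0.9, 0.3, 0.1),   # same as the first candidate except higher necrosis
]
front = pareto_front(candidates)
print(front)
```

This quadratic filter is enough for small candidate lists. Multi-objective algorithms of the kind compared in the abstract typically maintain such a set incrementally during the search.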
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562773 (published 2021-10-13)
Citations: 0
Predicting Influenza A Viral Host Using PSSM and Word Embeddings
Yanhua Xu, D. Wojtczak
The rapid mutation of influenza virus threatens public health. Reassortment among viruses with different hosts can lead to a fatal pandemic. However, it is difficult to detect the original host of the virus during or after an outbreak, as influenza viruses can circulate between different species. Therefore, early and rapid detection of the viral host would help reduce the further spread of the virus. We use various machine learning models with features derived from the position-specific scoring matrix (PSSM) and features learned from word embedding and word encoding to infer the origin host of viruses. The results show that the PSSM-based model reaches an MCC of around 95% and an F1 of around 96%. The model with word embedding obtains an MCC of around 96% and an F1 of around 97%.
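For readers unfamiliar with the two reported metrics, the sketch below computes the Matthews correlation coefficient (MCC) and the F1 score from a binary confusion matrix. This is a simplification: host prediction may well be a multi-class task, and the abstract does not say how the scores were averaged. The counts in the example are invented and are not results from the paper.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from raw counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Hypothetical counts for a two-host classifier.
tp, tn, fp, fn = 90, 85, 15, 10
print(round(f1_score(tp, fp, fn), 3), round(mcc(tp, tn, fp, fn), 3))
```

Unlike F1, the MCC also uses the true negatives, which is one reason the two numbers can differ on the same predictions.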
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562959 (published 2021-10-13)
Citations: 1
$\mathcal{LAJA}$ - Label Attention Transformer Architectures for ICD-10 Coding of Unstructured Clinical Notes
V. Mayya, Sowmya S Kamath, V. Sugumaran
Effective code assignment for patient clinical records in a hospital plays a significant role in standardizing medical records, mainly for streamlining clinical care delivery, billing, and insurance claim management. The current practice is manual coding, usually carried out by trained medical coders, which makes the process subjective, error-prone, inexact, and time-consuming. To alleviate this cost-intensive process, intelligent coding systems built on patients' structured electronic medical records are critical. Classification of medical diagnostic codes, like ICD-10, is widely employed to categorize patients' clinical conditions and associated diagnoses. In this work, we present a neural model, $\mathcal{LAJA}$, built on Label Attention Transformer Architectures for automatic assignment of ICD-10 codes. Our work is benchmarked on the CodiEsp dataset, a dataset for automatic clinical coding systems for multilingual medical documents, used in the eHealth CLEF 2020 Multilingual Information Extraction Shared Task. The experimental results reveal that the proposed $\mathcal{LAJA}$ variants outperform their basic BERT counterparts by 33-49% in terms of standard metrics like precision, recall, F1-score and mean average precision. The label attention mechanism also enables direct extraction of textual evidence in medical documents that maps to the clinical ICD-10 diagnostic codes.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562815 (published 2021-10-13)
Citations: 4
Human Activity Recognition Using Convolutional Neural Networks
Gulustan Dogan, Sinem Sena Ertas, Iremnaz Cay
Using smartphone sensors to recognize human activity may be advantageous due to the abundant volume of data that can be obtained. In this paper, we propose a deep learning approach based on sensor data for recognizing human activity. Our proposed recognition method uses linear accelerometer (LAcc), gyroscope (Gyr), and magnetometer (Mag) sensors to perceive eight transportation and locomotion activities: Still, Walk, Run, Bike, Bus, Car, Train, and Subway. In this study, data from three participants in the Sussex-Huawei Locomotion (SHL) Dataset are used to recognize the physical activities of the users. Fast Fourier Transform (FFT) spectrograms generated from the three axes of the LAcc, Gyr, and Mag sensor data are used as input data for our proposed Convolutional Neural Network (CNN) model. Experimental results on the task of human activity recognition demonstrate the effectiveness of our proposed user-independent approach over competitive baselines.
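A spectrogram of the kind described above is built by cutting one sensor axis into overlapping frames and taking the magnitude spectrum of each frame. The sketch below shows the idea for a single axis. It uses a naive discrete Fourier transform so that it needs only the standard library; a real pipeline would use an FFT routine, which computes the same values faster. The frame length, hop size, and test signal are assumptions, since the abstract does not give the paper's settings.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame for frequency bins 0 .. n/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len, hop):
    """One row of magnitudes per frame; rows are ordered in time."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [dft_magnitudes(signal[s:s + frame_len]) for s in starts]


# Hypothetical single-axis signal: a sine with 4 cycles per 32 samples.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
spec = spectrogram(signal, frame_len=32, hop=16)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Stacking such time-by-frequency arrays for each axis of each sensor would give image-like inputs of the sort a CNN can consume.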
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562906 (published 2021-10-13)
Citations: 5
Dynamically Regulated Initialization for S-system Modelling of Genetic Networks
Jaskaran Gill, M. Chetty, Adrian B. R. Shatte, J. Hallinan
Reverse engineering of gene regulatory networks from temporal gene expression data is an active area of research. Among the plethora of modelling techniques under investigation is the decoupled S-system model, which attempts to capture the non-linearity of biological systems in detail. For this model, the number of parameters to be estimated is very high even when the network is of small or medium scale, so the inference process poses a significant computational burden. In this paper, we propose: (1) a novel population initialization technique, Dynamically Regulated Prediction Initialization (DRPI), which utilises prior knowledge of biological gene expression data to create a feedback loop that produces dynamically regulated, high-quality individuals for the initial population; (2) an adaptive fitness function; and (3) a method for the maintenance of population diversity. The aim of this work is to reduce the computational complexity of the inference algorithm and thereby speed up the entire reverse-engineering process. The performance of the proposed algorithm was evaluated on a benchmark dataset and compared with other methods from earlier work. The experimental results show that we achieved higher accuracy with fewer fitness evaluations, considerably reducing the computational burden of the inference process.
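As background on why the parameter count grows so quickly, the S-system form commonly used in the literature is reproduced below. It is stated here from general knowledge rather than taken from the abstract, and the symbols ($X_i$ for the expression level of gene $i$, $N$ for the number of genes) are the conventional ones, assumed to match the paper's.

```latex
\frac{dX_i}{dt}
  = \alpha_i \prod_{j=1}^{N} X_j^{\,g_{ij}}
  - \beta_i \prod_{j=1}^{N} X_j^{\,h_{ij}},
  \qquad i = 1, \dots, N
```

Each gene $i$ has rate constants $\alpha_i, \beta_i$ and kinetic orders $g_{ij}, h_{ij}$ for $j = 1, \dots, N$, that is, $2(N+1)$ parameters per gene and $2N(N+1)$ for the whole network. In the decoupled formulation each gene's equation is fitted separately, so every subproblem still has $2(N+1)$ unknowns.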
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562958 (published 2021-10-13)
Citations: 6
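For context on the parameter burden mentioned in the abstract above: the S-system form usually given in the literature writes each gene's rate of change as the difference of two power-law terms, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The abstract does not spell this out, so the following is only a minimal sketch under that assumption. The function names `s_system_rhs` and `parameter_count` are hypothetical and are not taken from the paper.

```python
import math


def s_system_rhs(x, alpha, beta, g, h):
    """Return dX_i/dt for every gene i under the assumed S-system form.

    x      : current expression levels, one per gene
    alpha  : production rate constants, one per gene
    beta   : degradation rate constants, one per gene
    g, h   : n-by-n kinetic orders for production and degradation
    """
    n = len(x)
    rates = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        rates.append(production - degradation)
    return rates


def parameter_count(n):
    """Each gene has alpha_i, beta_i, n values g_ij and n values h_ij."""
    return 2 * n * (n + 1)


if __name__ == "__main__":
    for genes in (5, 10, 30):
        print(genes, "genes ->", parameter_count(genes), "parameters")
```

In a decoupled formulation, each gene's 2(n + 1) parameters are typically fitted separately against that gene's own time series, which is why the quality of the initial population for every sub-problem matters.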
SPOT-1D2: Improving Protein Secondary Structure Prediction using High Sequence Identity Training Set and an Ensemble of Recurrent and Residual-convolutional Neural Networks
Jaspreet Singh, Jaswinder Singh, K. Paliwal, Andrew Busch, Yaoqi Zhou
Protein secondary structure prediction has been a long-standing problem in computational biology. Recent advances in deep contextual learning have brought three-state prediction performance closer to the theoretical limit of 88–90%. Here, we show that a large training set with a 95% sequence identity cutoff can improve the prediction of secondary structures, even for unrelated test sequences (<25% sequence identity cutoff), compared to a non-redundant training dataset with a 25% sequence identity cutoff. Three-state prediction accuracy edges closer to 87%, and eight-state accuracy to 76%. The resulting model, called SPOT-1D2, is freely available to academic users at https://github.com/jas-preet/SPOT-1D2.
DOI: https://doi.org/10.1109/CIBCB49929.2021.9562849 | Published: 2021-10-13
Citations: 1
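The three-state and eight-state accuracies quoted in the abstract above are per-residue agreement rates between predicted and observed secondary structure labels. The sketch below shows how such scores are commonly computed. The eight-to-three reduction it uses (H, G, I to helix; E, B to strand; everything else to coil) is a widely used convention and is an assumption here, since the abstract does not state which mapping was applied. All names are hypothetical and unrelated to the SPOT-1D2 code base.

```python
# Assumed DSSP-style reduction from eight states to three (helix, strand, coil).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "C": "C",
}


def reduce_to_three(ss8):
    """Map an eight-state label string to its three-state equivalent."""
    return "".join(EIGHT_TO_THREE[label] for label in ss8)


def accuracy(predicted, observed):
    """Fraction of residues whose predicted label equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    observed8 = "HHGEETSC"   # toy example, not real data
    predicted8 = "HHHEETCC"
    q8 = accuracy(predicted8, observed8)
    q3 = accuracy(reduce_to_three(predicted8), reduce_to_three(observed8))
    print(f"Q8 = {q8:.2f}, Q3 = {q3:.2f}")
```

Because several eight-state labels collapse into one three-state class, a prediction can be wrong at the eight-state level and still correct at the three-state level, which is consistent with the three-state figure being the higher of the two.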