Mining arguments from cancer documents using Natural Language Processing and ontologies

2016 IEEE 12th International Conference on Intelligent Computer Communication and Processing (ICCP). Pub Date: 2016-07-27. DOI: 10.1109/ICCP.2016.7737126

Adrian Groza, Oana Popa


Citations: 2

Abstract

In the medical domain, the continuous stream of scientific research contains contradictory results supported by arguments and counter-arguments. Because medical expertise occurs at different levels, some human agents have difficulty coping with the huge number of studies and understanding the reasons and pieces of evidence put forward by the proponents and opponents of a debated topic. To better understand the arguments supporting new findings relative to the current state of the art in the medical domain, we need tools able to identify arguments in scientific papers. Our work aims to fill this technological gap. We rely on the well-known interleaving of domain knowledge with natural language processing. To formalise the existing medical knowledge, we rely on ontologies. To structure the argumentation model, we also use the expressivity and reasoning capabilities of Description Logics. To perform argumentation mining, we formalise various linguistic patterns in a rule-based language. We tested our solution against a corpus of scientific papers related to breast cancer. The experiments show an F-measure between 0.71 and 0.86 for identifying the conclusions of an argument and between 0.65 and 0.86 for identifying the premises of an argument.
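
To make the idea of rule-based argument mining and its F-measure evaluation more concrete, here is a minimal Python sketch. It is not the authors' system: the abstract does not list the linguistic patterns or the rule language used, so the cue phrases in `CONCLUSION_CUES` and `PREMISE_CUES`, and the function names `label_sentence` and `f_measure`, are hypothetical and chosen only for illustration. The sketch also leaves out the ontology and Description Logics components entirely.

```python
import re

# Hypothetical discourse cues (assumption): the paper's real rule set
# is not given in the abstract.
CONCLUSION_CUES = re.compile(
    r"\b(therefore|thus|we conclude that|these results suggest)\b",
    re.IGNORECASE,
)
PREMISE_CUES = re.compile(
    r"\b(because|since|given that|as shown by)\b",
    re.IGNORECASE,
)


def label_sentence(sentence):
    """Label a sentence as 'conclusion', 'premise', or 'none' using cue phrases."""
    if CONCLUSION_CUES.search(sentence):
        return "conclusion"
    if PREMISE_CUES.search(sentence):
        return "premise"
    return "none"


def f_measure(predicted, gold, label):
    """Balanced F-measure (F1) for one label over aligned prediction/gold lists."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p == label and g == label)
    fp = sum(1 for p, g in pairs if p == label and g != label)
    fn = sum(1 for p, g in pairs if p != label and g == label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, each sentence of a paper would be passed through `label_sentence`, and the resulting labels compared against manual annotations with `f_measure`, once for the "conclusion" label and once for the "premise" label. Surface cues alone are ambiguous (for example, "since" can be temporal), which is one reason an approach like the one described combines linguistic patterns with domain knowledge.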
