A Neuro Symbolic Approach for Contradiction Detection in Persian Text

J. Univers. Comput. Sci. Pub Date: 2023-03-28 DOI: 10.3897/jucs.90646

Zeinab Rahimi, M. Shamsfard

Journal: J. Univers. Comput. Sci. Volume: 25 1. Pages: 242-264. Publication type: Journal Article. URL: https://doi.org/10.3897/jucs.90646

Abstract

Detecting semantically contradictory sentences is a challenging and fundamental problem for some NLP applications, such as textual entailment recognition. In this study, contradiction covers different types of semantic confrontation, such as negation, antonymy, and numerical contradiction. Because there is not enough data to apply precise machine learning methods, and deep learning methods in particular, to Persian and other low-resource languages, rule-based approaches are of great interest. At the same time, the recent emergence of methods such as transfer learning has opened up the possibility of deep learning for low-resource languages. This paper introduces a hybrid contradiction detection approach for detecting seven categories of contradiction in Persian texts: antonymy, negation, numerical, factive, structural, lexical, and world knowledge. The proposed method consists of 1) a novel data mining method and 2) a transformer-based deep neural method for contradiction detection. A simple baseline is also presented for comparison. The data mining method uses frequent rule mining to extract appropriate contradiction detection rules from a development set. The extracted rules are tested on the different categories of contradictory sentences. In the first step, a classifier checks whether the rules work for an input sentence pair. Then, according to the result, the rules are used for three categories: negation, numerical, and antonymy. For these three categories, the highest F-measure is obtained for negation (90%), and the average F-measure is 86%. For the other four categories, on which the rules have a lower F-measure of 62%, the transformer-based method achieves 76%. The proposed hybrid approach has an overall F-measure above 80%.
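The routing idea in the abstract (a first-step check decides whether rules apply to a sentence pair, and a neural model handles the remaining pairs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy English word lists, the three hand-written rules, the lexical-overlap test standing in for the classifier, and the `neural_model` callable standing in for the transformer. It does not reproduce the paper's mined rules, its Persian resources, or its actual classifier.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical toy resources (English, for readability); not the paper's lexicons.
NEGATION_WORDS = {"not", "no", "never"}
ANTONYMS = {("hot", "cold"), ("open", "closed"), ("alive", "dead")}


def tokens(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into word and number tokens."""
    return re.findall(r"[a-z]+|\d+(?:\.\d+)?", sentence.lower())


def negation_rule(a: str, b: str) -> bool:
    """Fire when exactly one sentence is negated and the rest is identical."""
    ta, tb = tokens(a), tokens(b)
    neg_a = any(t in NEGATION_WORDS for t in ta)
    neg_b = any(t in NEGATION_WORDS for t in tb)
    rest_a = [t for t in ta if t not in NEGATION_WORDS]
    rest_b = [t for t in tb if t not in NEGATION_WORDS]
    return neg_a != neg_b and rest_a == rest_b


def numerical_rule(a: str, b: str) -> bool:
    """Fire when the words match but the numbers differ."""
    ta, tb = tokens(a), tokens(b)
    nums_a = [t for t in ta if t[0].isdigit()]
    nums_b = [t for t in tb if t[0].isdigit()]
    words_a = [t for t in ta if not t[0].isdigit()]
    words_b = [t for t in tb if not t[0].isdigit()]
    return bool(nums_a) and words_a == words_b and nums_a != nums_b


def antonym_rule(a: str, b: str) -> bool:
    """Fire when the sentences differ in one position by a known antonym pair."""
    ta, tb = tokens(a), tokens(b)
    if len(ta) != len(tb):
        return False
    diffs = [(x, y) for x, y in zip(ta, tb) if x != y]
    return len(diffs) == 1 and (
        diffs[0] in ANTONYMS or (diffs[0][1], diffs[0][0]) in ANTONYMS
    )


RULES = [("negation", negation_rule), ("numerical", numerical_rule),
         ("antonymy", antonym_rule)]


def rules_applicable(a: str, b: str, threshold: float = 0.5) -> bool:
    """Stand-in for the first-step classifier: a token-overlap (Jaccard) test."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return bool(sa | sb) and len(sa & sb) / len(sa | sb) >= threshold


def detect(a: str, b: str,
           neural_model: Callable[[str, str], bool]) -> Tuple[bool, str]:
    """Return (is_contradiction, source), where source names what decided."""
    if rules_applicable(a, b):
        for name, rule in RULES:
            if rule(a, b):
                return True, name
        return False, "none"
    # Pairs the rules are not suited to go to the (stubbed) neural model.
    return neural_model(a, b), "neural"
```

A call such as `detect("He has 3 cars", "He has 5 cars", model)` is decided by the numerical rule, while a pair with little lexical overlap is passed to whatever `model` is supplied. Keeping the router separate from both branches means either side can be replaced without touching the other.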
