配置:探索语篇层次的汉语修辞格

Proceedings of COLING. International Conference on Computational Linguistics Pub Date : 2022-09-16 DOI:10.48550/arXiv.2209.07678

Dawei Zhu, Qiusi Zhan, Zhejian Zhou, Yifan Song, Jiebin Zhang, Sujian Li

{"title":"配置:探索语篇层次的汉语修辞格","authors":"Dawei Zhu, Qiusi Zhan, Zhejian Zhou, Yifan Song, Jiebin Zhang, Sujian Li","doi":"10.48550/arXiv.2209.07678","DOIUrl":null,"url":null,"abstract":"Figures of speech, such as metaphor and irony, are ubiquitous in literature works and colloquial conversations. This poses great challenge for natural language understanding since figures of speech usually deviate from their ostensible meanings to express deeper semantic implications. Previous research lays emphasis on the literary aspect of figures and seldom provide a comprehensive exploration from a view of computational linguistics. In this paper, we first propose the concept of figurative unit, which is the carrier of a figure. Then we select 12 types of figures commonly used in Chinese, and build a Chinese corpus for Contextualized Figure Recognition (ConFiguRe). Different from previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context, and classifying the figurative unit into the right figure type. On ConFiguRe, three tasks, i.e., figure extraction, figure type classification and figure recognition, are designed and the state-of-the-art techniques are utilized to implement the benchmarks. We conduct thorough experiments and show that all three tasks are challenging for existing models, thus requiring further research. Our dataset and code are publicly available at https://github.com/pku-tangent/ConFiguRe.","PeriodicalId":91381,"journal":{"name":"Proceedings of COLING. International Conference on Computational Linguistics","volume":"1 1","pages":"3374-3385"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ConFiguRe: Exploring Discourse-level Chinese Figures of Speech\",\"authors\":\"Dawei Zhu, Qiusi Zhan, Zhejian Zhou, Yifan Song, Jiebin Zhang, Sujian Li\",\"doi\":\"10.48550/arXiv.2209.07678\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Figures of speech, such as metaphor and irony, are ubiquitous in literature works and colloquial conversations. This poses great challenge for natural language understanding since figures of speech usually deviate from their ostensible meanings to express deeper semantic implications. Previous research lays emphasis on the literary aspect of figures and seldom provide a comprehensive exploration from a view of computational linguistics. In this paper, we first propose the concept of figurative unit, which is the carrier of a figure. Then we select 12 types of figures commonly used in Chinese, and build a Chinese corpus for Contextualized Figure Recognition (ConFiguRe). Different from previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context, and classifying the figurative unit into the right figure type. On ConFiguRe, three tasks, i.e., figure extraction, figure type classification and figure recognition, are designed and the state-of-the-art techniques are utilized to implement the benchmarks. We conduct thorough experiments and show that all three tasks are challenging for existing models, thus requiring further research. Our dataset and code are publicly available at https://github.com/pku-tangent/ConFiguRe.\",\"PeriodicalId\":91381,\"journal\":{\"name\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"volume\":\"1 1\",\"pages\":\"3374-3385\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2209.07678\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of COLING. International Conference on Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2209.07678","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

隐喻、反讽等修辞手法在文学作品和口语对话中无处不在。这给自然语言理解带来了巨大的挑战，因为修辞格通常会偏离其表面意义来表达更深层次的语义含义。以往的研究多侧重于数字的文学方面，很少从计算语言学的角度进行全面的探索。本文首先提出了形象单位的概念，形象单位是形象的载体。在此基础上，选取12种汉语常用图形，构建语境化图形识别的汉语语料库(ConFiguRe)。与以往的符号级或句子级的对等物不同，ConFiguRe的目的是从话语级语境中提取比喻单位，并将比喻单位分类为正确的比喻类型。在配置上，设计了三个任务，即图形提取、图形类型分类和图形识别，并利用最先进的技术来实现基准。我们进行了彻底的实验，并表明这三个任务对现有模型都是具有挑战性的，因此需要进一步的研究。我们的数据集和代码可以在https://github.com/pku-tangent/ConFiguRe上公开获取。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

ConFiguRe: Exploring Discourse-level Chinese Figures of Speech

Figures of speech, such as metaphor and irony, are ubiquitous in literature works and colloquial conversations. This poses great challenge for natural language understanding since figures of speech usually deviate from their ostensible meanings to express deeper semantic implications. Previous research lays emphasis on the literary aspect of figures and seldom provide a comprehensive exploration from a view of computational linguistics. In this paper, we first propose the concept of figurative unit, which is the carrier of a figure. Then we select 12 types of figures commonly used in Chinese, and build a Chinese corpus for Contextualized Figure Recognition (ConFiguRe). Different from previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context, and classifying the figurative unit into the right figure type. On ConFiguRe, three tasks, i.e., figure extraction, figure type classification and figure recognition, are designed and the state-of-the-art techniques are utilized to implement the benchmarks. We conduct thorough experiments and show that all three tasks are challenging for existing models, thus requiring further research. Our dataset and code are publicly available at https://github.com/pku-tangent/ConFiguRe.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of COLING. International Conference on Computational Linguistics

自引率

0.00%

发文量