Exploring Coreference Features in Heterogeneous Data with Text Classification

Ekaterina Lapshinova-Koltunski, K. Kunz
Published in: Proceedings of the First Workshop on Computational Approaches to Discourse, 2020-11-01
DOI: 10.18653/v1/2020.codi-1.6
URL: https://doi.org/10.18653/v1/2020.codi-1.6
Citations: 4

Abstract

The present paper focuses on variation phenomena in coreference chains. We address the hypothesis that the degree of structural variation between chain elements depends on language-specific constraints and preferences and, even more, on the communicative situation of language production. We define coreference features that also include reference to abstract entities and events. These features are inspired by several sources: cognitive parameters, pragmatic factors and typological status. We examine the distributions of these features in a dataset containing English and German texts in spoken and written discourse modes, which can be classified into seven different registers. We apply text classification and feature selection to find out how these variational dimensions (language, mode and register) affect coreference features. Knowledge of the variation under analysis is valuable for contrastive linguistics, translation studies and multilingual natural language processing (NLP), for example machine translation or cross-lingual coreference resolution.
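
The general idea of combining feature selection with text classification can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names (`pronoun_share`, `demonstrative_share`, `mean_chain_length`), the toy values, the Fisher-style ranking score and the nearest-centroid classifier are hypothetical stand-ins, not the features, data or algorithms used in the paper.

```python
from statistics import mean, pvariance

# Hypothetical feature names and invented toy values, for illustration only.
FEATURES = ["pronoun_share", "demonstrative_share", "mean_chain_length"]
TEXTS = [
    ([0.62, 0.20, 5.1], "spoken"),
    ([0.58, 0.24, 4.8], "spoken"),
    ([0.66, 0.18, 5.0], "spoken"),
    ([0.35, 0.10, 4.9], "written"),
    ([0.31, 0.12, 5.2], "written"),
    ([0.38, 0.08, 5.0], "written"),
]


def fisher_scores(rows, labels):
    """Score each feature for a two-class problem:
    (difference of class means)^2 / (sum of class variances)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == classes[0]]
        b = [r[j] for r, y in zip(rows, labels) if y == classes[1]]
        spread = pvariance(a) + pvariance(b)
        # Small constant avoids division by zero for constant features.
        scores.append((mean(a) - mean(b)) ** 2 / (spread + 1e-12))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def fit_centroids(rows, labels, keep):
    """Compute one mean vector per class over the selected features."""
    centroids = {}
    for y in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == y]
        centroids[y] = [mean(m[j] for m in members) for j in keep]
    return centroids


def predict(row, centroids, keep):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((row[j] - c) ** 2 for j, c in zip(keep, centroid))
    return min(centroids, key=lambda y: dist(centroids[y]))


rows = [r for r, _ in TEXTS]
labels = [y for _, y in TEXTS]
keep = select_top_k(fisher_scores(rows, labels), k=2)
centroids = fit_centroids(rows, labels, keep)
print([FEATURES[j] for j in keep])
print(predict([0.60, 0.22, 5.0], centroids, keep))
```

In a sketch of this kind, the ranking step indicates which features differ most between the two classes, and the classifier indicates how well the selected features separate them. The same procedure could in principle be repeated with language or register as the class label.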