Investigating Diatopic Variation in a Historical Corpus

Workshop on NLP for Similar Languages, Varieties and Dialects
Pub Date: 2017-04-01
DOI: 10.18653/v1/W17-1204

Stefanie Dipper, Sandra Waldenberger

Citations: 2

Abstract

This paper investigates diatopic variation in a historical corpus of German. Based on equivalent word forms from different language areas, replacement rules and mappings are derived which describe the relations between these word forms. These rules and mappings are then interpreted as reflections of morphological, phonological or graphemic variation. Based on sample rules and mappings, we show that our approach can replicate results from historical linguistics. While previous studies were restricted to predefined word lists, or confined to single authors or texts, our approach uses a much wider range of data available in historical corpora.
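The abstract does not spell out how the replacement rules are derived, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that equivalent word forms are already available as pairs, aligns each pair at the character level with Python's standard `difflib.SequenceMatcher`, and counts the differing segments. The function names `replacement_rules` and `count_rules` are hypothetical, and the word pairs are invented toy examples, not data from the paper's corpus.

```python
from collections import Counter
from difflib import SequenceMatcher


def replacement_rules(source: str, target: str) -> list[tuple[str, str]]:
    """Return (source_segment, target_segment) pairs where the two forms differ.

    Assumption: a plain character alignment stands in for whatever
    alignment the paper actually uses.
    """
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    rules = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side marks an insertion or deletion.
            rules.append((source[i1:i2], target[j1:j2]))
    return rules


def count_rules(pairs):
    """Count how often each replacement rule occurs across all word-form pairs."""
    counts = Counter()
    for source, target in pairs:
        counts.update(replacement_rules(source, target))
    return counts


# Invented toy pairs (form from area A, equivalent form from area B).
toy_pairs = [("zit", "zeit"), ("min", "mein"), ("hus", "haus"), ("tag", "tac")]

for rule, freq in count_rules(toy_pairs).most_common():
    print(rule, freq)
```

Frequent rules in such a count are candidates for systematic variation, but whether a given rule reflects a graphemic, phonological or morphological difference still has to be decided by linguistic interpretation, as the abstract describes.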



