Towards a broad-coverage graphemic analysis of large historical corpora
Sandra Waldenberger, Stefanie Dipper, Ilka Lemke
Zeitschrift Fur Sprachwissenschaft, 40 1, pages 401-420. Journal article, published 2021-11-01.
DOI: 10.1515/zfs-2021-2037 (https://doi.org/10.1515/zfs-2021-2037)
Citation count: 0
Abstract
This paper presents a method which we are developing to explore graphemic variation in large historical corpora of German. Historical corpora provide an amount of data at the level of graphemics which cannot be handled exhaustively using common methods of manual evaluation. To deal with this challenge, we apply methods from computational linguistics to pave the way for a broad-coverage graph(em)ic analysis of large historical corpora. In this paper, we show how our approach can be applied to the Reference Corpus of Middle High German. Illustrating our method and linguistic analysis, we present findings from our investigations into diatopic and/or diachronic variation as documented in 13th and 14th century charters (Urkunden) from the corpus.
About the journal:
The aim of the journal is to promote linguistic research by publishing high-quality contributions and thematic special issues from all fields and trends of modern linguistics. In addition to articles and reviews, the journal also features contributions to discussions on current controversies in the field as well as overview articles outlining the state of the art of relevant research paradigms.
Topics:
- General Linguistics
- Language Typology
- Language acquisition, language change and synchronic variation
- Empirical linguistics: experimental and corpus-based research
- Contributions to theory-building