Corpus philology: Using the Dictionary of Old English to get bigger data for Old English spelling variation
Author: Mark Faulkner
Journal: Digital Scholarship in the Humanities (impact factor 0.7; category: Humanities, Multidisciplinary)
Publication type: Journal Article
Published: 2023-10-11
DOI: 10.1093/llc/fqad064 (https://doi.org/10.1093/llc/fqad064)
Citations: 0
Abstract
This article presents a methodology for obtaining large datasets for the spelling of individual phonological segments in Old English texts, based on searching the Dictionary of Old English Corpus for the attested spellings listed in the Dictionary of Old English A-H. It exemplifies this ‘corpus philology’ through a study of 216,526 spellings for words beginning with h followed by a vowel, using a variety of techniques to evaluate the methodology’s precision and recall, which are calculated as very high for <h->initial spellings (precision 100%, recall 92.1%) and moderate, but still usable, for <h->less spellings (precision 85.5%, recall 58.3%). Data for fourteen other segments related to the behaviour of h- in Old English is presented in the Supplementary Materials that complement the paper online. This dataset of 379,484 spellings from 2,605 Old English texts is shown to seriously problematize the findings of traditional philology, the conclusions of which are in contrast based on only a handful of spellings from a few texts, and to have the potential to radically enhance our understanding of the literary and linguistic histories of English.
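The precision and recall figures quoted in the abstract can be read with the help of a small sketch. The Python below is not the author's pipeline: it assumes a simple exact-match lookup of corpus tokens against a list of attested spellings, and the token list, spelling list, and hand-checked "gold" set are all invented toy data, as is the helper name `precision_recall`.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of token IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved tokens that are genuinely relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant tokens that the search actually found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus: (token_id, form) pairs, not real corpus data.
corpus = [(0, "hus"), (1, "and"), (2, "hand"), (3, "us"), (4, "heofon"), (5, "hond")]

# Hypothetical list of attested spellings, standing in for dictionary entries.
attested = {"hus", "hand", "heofon"}

# Hypothetical manual check: IDs of tokens that really are h-initial words.
gold = {0, 2, 4, 5}

# Search step (assumed): keep every token whose form is on the attested list.
retrieved = {token_id for token_id, form in corpus if form in attested}

p, r = precision_recall(retrieved, gold)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy run every retrieved token is correct, but the form "hond" is missed because it is absent from the assumed spelling list, so recall falls below precision. That is the same direction as the gap between the two figures reported for the h-initial spellings.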
About the journal:
DSH, or Digital Scholarship in the Humanities, is an international, peer-reviewed journal which publishes original contributions on all aspects of digital scholarship in the Humanities including, but not limited to, the field of what is currently called the Digital Humanities. Long and short papers report on theoretical, methodological, experimental, and applied research and include results of research projects, descriptions and evaluations of tools, techniques, and methodologies, and reports on work in progress. DSH also publishes reviews of books and resources. Digital Scholarship in the Humanities was previously known as Literary and Linguistic Computing.