P. Plecháč, Klemens Bobenhausen, Benjamin Hammerich
Versification and authorship attribution. A pilot study on Czech, German, Spanish, and English poetry
Studia Metrica et Poetica, published 2019-01-28
DOI: 10.12697/SMP.2018.5.2.02 (https://doi.org/10.12697/SMP.2018.5.2.02)
Citation count: 23
Abstract
This article describes pilot experiments performed as one part of a long-term project examining the possibilities for using versification analysis to determine the authorship of poetic texts. Since we are addressing this article to both stylometry experts and experts in the study of verse, we first introduce in detail the common classifiers used in contemporary stylometry (Burrows’ Delta, Argamon’s Quadratic Delta, Smith-Aldridge’s Cosine Delta, and the Support Vector Machine) and explain how they work via graphic examples. We then evaluate these classifiers’ performance when used with the versification features found in Czech, German, Spanish, and English poetry. We conclude that versification is a reasonable stylometric marker whose strength is comparable to that of the markers traditionally used in stylometry (such as the frequencies of the most frequent words and of the most frequent character n-grams).
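For readers less familiar with the Delta family named above, the following is a minimal sketch of how the three distance measures are commonly formulated: each operates on z-scored feature vectors, and the unknown text is assigned to the nearest candidate. It is an illustration under stated assumptions, not the authors' implementation. The helper names (`z_scores`, `attribute`) and the three-value toy feature vectors are hypothetical, and normalizing constants differ between published formulations.

```python
import math
from statistics import mean, pstdev


def z_scores(profiles):
    """Standardize each feature column across all texts (hypothetical helper)."""
    columns = list(zip(*profiles))
    mus = [mean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid division by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in profiles]


def burrows_delta(a, b):
    """Mean absolute difference of z-scores (Manhattan-style distance)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def quadratic_delta(a, b):
    """Sum of squared differences of z-scores (squared Euclidean distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_delta(a, b):
    """One minus the cosine similarity of the two z-score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def attribute(candidates, unknown, distance):
    """Return the candidate label whose profile lies nearest to the unknown text."""
    labels = list(candidates)
    z = z_scores([candidates[label] for label in labels] + [unknown])
    target = z[-1]
    return min(labels, key=lambda label: distance(z[labels.index(label)], target))


# Invented toy data: relative frequencies of three made-up versification features.
candidates = {"A": [0.60, 0.25, 0.15], "B": [0.30, 0.45, 0.25]}
unknown = [0.58, 0.27, 0.15]

for measure in (burrows_delta, quadratic_delta, cosine_delta):
    print(measure.__name__, "->", attribute(candidates, unknown, measure))
```

In practice the feature vectors would hold many more dimensions, and the z-scores would be computed over a full reference corpus rather than three texts. The Support Vector Machine mentioned in the abstract is a trained classifier rather than a distance measure and is not sketched here.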