Lemmatization for Ancient Greek

Journal of Greek Linguistics (LANGUAGE & LINGUISTICS) | IF 0.5 | Pub Date: 2020-11-12 | DOI: 10.1163/15699846-02002001
A. Vatri, Barbara McGillivray
Journal of Greek Linguistics, vol. 20, pp. 179-196.
Citations: 1

Abstract

This article presents the result of accuracy tests for currently available Ancient Greek lemmatizers and recently published lemmatized corpora. We ran a blinded experiment in which three highly proficient readers of Ancient Greek evaluated the output of the CLTK lemmatizer, of the CLTK backoff lemmatizer, and of GLEM, together with the lemmatizations offered by the Diorisis corpus and the Lemmatized Ancient Greek Texts repository. The texts chosen for this experiment are Homer, Iliad 1.1–279 and Lysias 7. The results suggest that lemmatization methods using large lexica as well as part-of-speech tagging—such as those employed by the Diorisis corpus and the CLTK backoff lemmatizer—are more reliable than methods that rely more heavily on machine learning and use smaller lexica.
Source journal: Journal of Greek Linguistics (LANGUAGE & LINGUISTICS)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 1
Review time: 42 weeks