A Corpus of the Sorani Kurdish Folkloric Lyrics

Sina Ahmadi, Hossein Hassani, K. Abedi
{"title":"A Corpus of the Sorani Kurdish Folkloric Lyrics","authors":"Sina Ahmadi, Hossein Hassani, K. Abedi","doi":"10.13025/1YDH-EW61","DOIUrl":null,"url":null,"abstract":"Kurdish poetry and prose narratives were historically transmitted orally and less in a written form. Being an essential medium of oral narration and literature, Kurdish lyrics have had a unique attribute in becoming a vital resource for different types of studies, including Digital Humanities, Computational Folkloristics and Computational Linguistics. As an initial study of its kind for the Kurdish language, this paper presents our efforts in transcribing and collecting Kurdish folk lyrics as a corpus that covers various Kurdish musical genres, in particular Beyt, Gorani, Bend, and Heyran. We believe that this corpus contributes to Kurdish language processing in several ways, such as compensation for the lack of a long history of written text by incorporating oral literature, presenting an unexplored realm in Kurdish language processing, and assisting the initiation of Kurdish computational folkloristics. Our corpus contains 49,582 tokens in the Sorani dialect of Kurdish. The corpus is publicly available in the Text Encoding Initiative (TEI) format for non-commercial use.","PeriodicalId":190269,"journal":{"name":"Workshop on Spoken Language Technologies for Under-resourced Languages","volume":"331 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop on Spoken Language Technologies for Under-resourced Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.13025/1YDH-EW61","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Kurdish poetry and prose narratives were historically transmitted orally and less in a written form. Being an essential medium of oral narration and literature, Kurdish lyrics have had a unique attribute in becoming a vital resource for different types of studies, including Digital Humanities, Computational Folkloristics and Computational Linguistics. As an initial study of its kind for the Kurdish language, this paper presents our efforts in transcribing and collecting Kurdish folk lyrics as a corpus that covers various Kurdish musical genres, in particular Beyt, Gorani, Bend, and Heyran. We believe that this corpus contributes to Kurdish language processing in several ways, such as compensation for the lack of a long history of written text by incorporating oral literature, presenting an unexplored realm in Kurdish language processing, and assisting the initiation of Kurdish computational folkloristics. Our corpus contains 49,582 tokens in the Sorani dialect of Kurdish. The corpus is publicly available in the Text Encoding Initiative (TEI) format for non-commercial use.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
索拉尼族库尔德民歌歌词语料库
库尔德诗歌和散文叙事在历史上是口头传播的,很少以书面形式传播。作为口头叙述和文学的重要媒介,库尔德歌词具有独特的属性,成为不同类型研究的重要资源,包括数字人文、计算民俗学和计算语言学。作为对库尔德语的初步研究,本文介绍了我们在转录和收集库尔德民间歌词方面所做的努力,这些歌词作为一个语料库,涵盖了各种库尔德音乐流派,特别是Beyt, Gorani, Bend和Heyran。我们相信这个语料库在几个方面有助于库尔德语的处理,比如通过结合口头文学来弥补缺乏长期的书面文本历史,在库尔德语处理中呈现一个未开发的领域,并协助库尔德计算民俗学的启动。我们的语料库包含49,582个库尔德语索拉尼方言的符号。该语料库以文本编码倡议(TEI)格式公开提供,供非商业用途。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A Corpus of the Sorani Kurdish Folkloric Lyrics A Sentiment Analysis Dataset for Code-Mixed Malayalam-English Corpus Creation for Sentiment Analysis in Code-Mixed Tamil-English Text Text Normalization for Bangla, Khmer, Nepali, Javanese, Sinhala and Sundanese Text-to-Speech Systems Crowd-Sourced Speech Corpora for Javanese, Sundanese, Sinhala, Nepali, and Bangladeshi Bengali
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1