Analytical grammar forms extraction as a new challenge for corpora (Case of conditional mood in Polish and Ukrainian)

IF 1.8 | Zone 3, Earth Sciences | Q2 Paleontology | Acta Palaeontologica Polonica | Pub Date: 2022-01-01 | DOI: 10.17651/polon.42.9
S. Fokin
Citations: 0

Abstract

A particular challenge for modern textual corpora is the tagging of analytical grammatical categories. The components of these categories may be separated by other words in certain contexts, or may even be inverted. Of particular interest for the selection of analytical grammatical forms is the conditional mood in some Slavic languages, which is expressed by two words: a past verb form and the particle by/б/би/бы. This is why most modern corpora lack a specific tag for these compound forms. The case of Polish is particularly complicated because the particle by may either be merged with the participle or used separately; furthermore, its separate form may carry a personal verb ending. Specific queries, tested experimentally on Polish and Ukrainian corpora, make it possible to select the analytical forms in question.
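The three configurations the abstract describes (merged particle, separate particle before the verb, separate particle after the verb) can be illustrated with a minimal sketch. This is not the author's query set and does not use any real corpus query language. The tag name `PAST`, the function `find_conditionals`, the `window` parameter and the two regular expressions are all assumptions made for illustration, and conjunctions that contain the particle (such as Polish żeby) are deliberately left out.

```python
import re

# Separate particle: Polish "by" with an optional personal ending,
# or Ukrainian "би" / "б". (Illustrative pattern, not exhaustive.)
PARTICLE = re.compile(r"^(by(m|ś|śmy|ście)?|би|б)$", re.IGNORECASE)

# Merged Polish form: past stem + gender/number marker + "by" + optional
# personal ending, e.g. "zrobiłbym". (Illustrative pattern, not exhaustive.)
MERGED = re.compile(r"^\w+(ł|ła|ło|li|ły)by(m|ś|śmy|ście)?$", re.IGNORECASE)


def find_conditionals(tokens, window=3):
    """Return (kind, particle_index, verb_index) for each conditional found.

    tokens is a list of (word, tag) pairs; the hypothetical tag "PAST"
    marks a past verb form. window is the maximum distance, in tokens,
    allowed between a separate particle and its verb.
    """
    hits = []
    for i, (word, _tag) in enumerate(tokens):
        if MERGED.match(word):
            # Particle and verb share one token.
            hits.append(("merged", i, i))
        elif PARTICLE.match(word):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            candidates = [j for j in range(lo, hi)
                          if j != i and tokens[j][1] == "PAST"]
            if candidates:
                # Pair the particle with the nearest past form, on either side.
                j = min(candidates, key=lambda k: abs(k - i))
                kind = "particle-first" if i < j else "verb-first"
                hits.append((kind, i, j))
    return hits


# Polish, merged: "Zrobiłbym to"
merged = find_conditionals([("Zrobiłbym", "V"), ("to", "PRON")])
# Polish, separate particle with personal ending: "Ja bym to zrobił"
separated = find_conditionals(
    [("Ja", "PRON"), ("bym", "PART"), ("to", "PRON"), ("zrobił", "PAST")])
# Ukrainian, particle after the verb: "Я зробив би це"
ukrainian = find_conditionals(
    [("Я", "PRON"), ("зробив", "PAST"), ("би", "PART"), ("це", "PRON")])
```

Pairing the particle with the nearest past form inside a small window is one simple way to tolerate intervening words and inversion; a real corpus query would also need to handle clause boundaries and conjunctions that absorb the particle.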