Analytical grammar forms extraction as a new challenge for corpora (Case of conditional mood in Polish and Ukrainian)
Author: S. Fokin
DOI: 10.17651/polon.42.9 (https://doi.org/10.17651/polon.42.9)
Journal: Acta Palaeontologica Polonica
Publication date: 2022-01-01
Publication type: Journal Article
Abstract
Tagging analytical grammatical categories is a particular challenge for modern text corpora. The components of such categories may be separated by other words in certain contexts, or may even be inverted. Of particular interest for the extraction of analytical grammatical forms is the conditional mood in some Slavic languages, which is expressed by two words: a past verb form and the particle by/б/би/бы. Because the form spans two words, most modern corpora have no specific tag for these compound forms. The Polish case is particularly complicated because the particle by may either be merged with the participle or used separately; furthermore, its separate form may carry a personal verb ending. Specific queries, tested experimentally on Polish and Ukrainian corpora, make it possible to select the analytical forms in question.
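To make the extraction problem concrete, the following is a minimal Python sketch of the kind of search the abstract describes. It is not the author's query and does not use any real corpus query language: the `(form, tag)` token format, the tag name `PAST`, the function `find_conditionals`, and the `window` parameter are all assumptions introduced for illustration. It covers the three situations named above: a particle merged with the past form, a separate particle (possibly with a personal ending), and either order of the two components.

```python
from typing import List, Tuple

# (surface form, simplified tag). The tag "PAST" is a hypothetical label
# for a past verb form; real corpora use their own tagsets.
Token = Tuple[str, str]

# Polish: the separate particle, optionally with a personal ending.
PL_PARTICLES = {"by", "bym", "byś", "byśmy", "byście"}
# Polish: the same strings when merged with the past form (zrobił+by+m).
PL_MERGED = ("byście", "byśmy", "bym", "byś", "by")
# Ukrainian: the particle is written separately in this sketch.
UK_PARTICLES = {"б", "би"}


def find_conditionals(
    tokens: List[Token],
    particles: set,
    merged_suffixes: Tuple[str, ...] = (),
    window: int = 3,
) -> List[Tuple[str, ...]]:
    """Return candidate conditional forms as tuples of surface forms."""

    def is_merged(form: str, tag: str) -> bool:
        # A past form that already ends in the particle (+ personal ending).
        return tag == "PAST" and form.lower().endswith(merged_suffixes)

    hits: List[Tuple[str, ...]] = []
    for i, (form, tag) in enumerate(tokens):
        if is_merged(form, tag):
            hits.append((form,))
            continue
        if form.lower() in particles:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Look for a non-merged past form on either side of the particle.
            candidates = [
                j for j in range(lo, hi)
                if j != i and tokens[j][1] == "PAST" and not is_merged(*tokens[j])
            ]
            if candidates:
                j = min(candidates, key=lambda k: abs(k - i))  # nearest one
                # Keep the components in their textual order.
                hits.append((tokens[j][0], form) if j < i else (form, tokens[j][0]))
    return hits


# "Zrobiłbym to, ale on by tego nie chciał."
polish = [
    ("Zrobiłbym", "PAST"), ("to", "OTHER"), (",", "PUNCT"), ("ale", "OTHER"),
    ("on", "OTHER"), ("by", "OTHER"), ("tego", "OTHER"), ("nie", "OTHER"),
    ("chciał", "PAST"),
]
print(find_conditionals(polish, PL_PARTICLES, PL_MERGED))
# [('Zrobiłbym',), ('by', 'chciał')]

# "Він хотів би"
ukrainian = [("Він", "OTHER"), ("хотів", "PAST"), ("би", "OTHER")]
print(find_conditionals(ukrainian, UK_PARTICLES))
# [('хотів', 'би')]
```

The fixed token window is a deliberate simplification. It will miss components separated by longer insertions, and it does not handle the particle attached to conjunctions (for example Polish gdyby or Ukrainian якби), so its results should be read as candidates for manual checking rather than as a finished tagger.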
About the journal:
Acta Palaeontologica Polonica is an international quarterly journal publishing papers of general interest from all areas of paleontology. Since its founding by Roman Kozłowski in 1956, various currents of modern paleontology have been represented in the contents of the journal, especially those rooted in biologically oriented paleontology, an area he helped establish.
In-depth studies of all kinds of fossils, of the mode of life of ancient organisms, and of the structure of their skeletons are welcome, as are those offering stratigraphically ordered evidence of evolution. Work on vertebrates and applications of fossil evidence to developmental studies, both ontogeny and astogeny of clonal organisms, have a long tradition in our journal. Evolution of the biosphere and its ecosystems, as inferred from geochemical evidence, has also been the focus of studies published in the journal.