Applied Corpus Linguistics for Lexicography: Sepedi Negation as a Case in Point

Lexikos | Pub Date: 2022-01-01 | DOI: 10.5788/32-2-1698 | IF 0.9 | Zone 2, Literature | JCR: N/A | LANGUAGE & LINGUISTICS
Gertrud Faaß

Abstract

So far, Sepedi negations have been considered mainly from the point of view of lexicographical treatment. Theoretical works on Sepedi have been used for this purpose, with the objective of a neat description of these negations in a (paper) dictionary. This paper takes a different perspective and uses corpus linguistic methods instead of theoretical works: (1) a Sepedi corpus is examined on the basis of existing descriptions of the occurrences of a relevant verb, looking at its negated forms from a purely prescriptive point of view; (2) a "corpus-driven" strategy is employed, looking only for sequences of negation particles (or morphemes) in order to list occurring constructions, without taking into account the verbs occurring in them, apart from their endings. The approach in (2) is only intended to show a possible methodology for extending existing theories on occurring negations. We would also like to help lexicographers establish a frequency-based order of entries for possible negation forms in their dictionaries by showing them the number of respective occurrences. As with all corpus linguistic work, however, we must regard corpus evidence not as representative, but as showing tendencies of language use that can be detected and described. This is especially true for Sepedi, for which only a few small corpora exist. This paper also describes the resources and tools used to create the necessary corpus and how it was annotated with parts of speech and lemmas. An assessment of the quality of available Sepedi part-of-speech taggers with respect to verbs, negation morphemes and subject concords may be a positive side result.

Keywords: African languages dictionaries, corpus linguistics, negation, Sepedi, Northern Sotho, lexicography, part-of-speech tagging, corpus query processing
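
As an illustration only, the "corpus-driven" strategy in (2) could be sketched roughly as follows. This is not the paper's actual corpus query or toolchain: the tag names `NEG`, `V`, `CS` and `PRES`, the function `negation_patterns`, and the toy sentences are all assumptions made up for this sketch, and the counts it prints are not corpus evidence. The idea is simply to collect the negation morphemes that precede a verb, keep only the verb's final letter, and rank the resulting patterns by frequency.

```python
from collections import Counter

# Hypothetical tag names; real Sepedi tagsets use different labels.
NEG_TAG = "NEG"
VERB_TAG = "V"


def negation_patterns(sentences):
    """Count (negation sequence, verb ending) patterns in tagged sentences.

    Each sentence is a list of (token, tag) pairs. The verb itself is
    ignored apart from its final letter, and tokens that are neither
    negation morphemes nor verbs (e.g. subject concords) are skipped.
    """
    counts = Counter()
    for sentence in sentences:
        pending = []  # negation morphemes seen since the last verb
        for token, tag in sentence:
            if tag == NEG_TAG:
                pending.append(token.lower())
            elif tag == VERB_TAG:
                if pending:
                    ending = "-" + token[-1].lower()
                    counts[(" ".join(pending), ending)] += 1
                pending = []
    return counts


# Invented toy data for demonstration, not taken from any real corpus.
toy_corpus = [
    [("ga", "NEG"), ("ke", "CS"), ("rate", "V")],
    [("ga", "NEG"), ("a", "CS"), ("reke", "V")],
    [("o", "CS"), ("se", "NEG"), ("reke", "V")],
    [("ke", "CS"), ("a", "PRES"), ("reka", "V")],
]

# List patterns from most to least frequent, as a lexicographer might order entries.
for (particles, ending), freq in negation_patterns(toy_corpus).most_common():
    print(f"{particles} ... {ending}\t{freq}")
```

Sorting with `most_common()` mirrors the frequency-based ordering of dictionary entries that the abstract mentions; on a real corpus, the reliability of such counts would depend on the quality of the part-of-speech tagging.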
Source journal: Lexikos
CiteScore: 1.00
Self-citation rate: 25.00%
Articles published: 15
Review time: 7 weeks
About the journal: Lexikos (Greek for "of or for words") is a journal for the lexicographical specialist. It is the only journal in Africa which is exclusively devoted to lexicography. Articles dealing with all aspects of lexicography and terminology, or with the implications that research in related disciplines such as linguistics and computer and information science has for lexicography, will be considered for publication. Articles may be written in Afrikaans, English, Dutch, German and French.