International Journal of Lexicography: Latest Articles

Two Ways of Representing Specialist Knowledge: Analysing the Botanical Lexicon in Diccionario de la Lengua Española and Diccionario del Español de México
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-07-11 DOI: 10.1093/ijl/ecad014
Jesús Camacho-Niño
This article explores the lexicographic codification of botanical knowledge in two general dictionaries: Diccionario de la lengua española (2014 [2021]), produced by the Real Academia Española in Spain, and Diccionario del español de México (2019), published by El Colegio de México. It begins with a historical overview of the inclusion of botanical terminology in general dictionaries by the Real Academia Española and by other authors. Then, it analyses the degree to which the botanical terms included in each dictionary meet the inclusion criteria established. Finally, it focuses on diatechnical labelling and the differences between the two dictionaries in terms of the disciplines to which terms are assigned. This allows us to draw conclusions regarding the representation of this field of knowledge in the two dictionaries and the lexicographic techniques used to produce them.
Citations: 0
Categorizing obsolete, archaic, and classic words in an Indonesian dictionary
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-29 DOI: 10.1558/lexi.24757
Dewi Puspita, K. Yusuf
The global era has led to fairly rapid changes in language. Many words have become obsolete. There are also many words whose meanings have become irrelevant nowadays. Unfortunately, in Indonesian dictionaries, especially in the Kamus Besar Bahasa Indonesia (KBBI; Comprehensive dictionary of Indonesian), there is no label for obsolete words. There is only an “archaic” label to mark all outdated words and a “classic” label to mark classical words. Another labeling problem in the KBBI is that there are no clear guidelines or criteria to determine when a word is considered archaic, obsolete, or classic. The absence of clear criteria causes some entries that have been labeled “archaic” in the KBBI to seem obsolete, and sometimes words labeled as classic get confused with archaic words. The aim of this article is to investigate ways of categorizing archaic, obsolete, and classic words in the KBBI. This research was conducted by comparing several forms and entry criteria labeled “archaic,” “obsolete,” and “classic” in several dictionaries, in particular dictionaries of foreign languages whose lexicographic traditions are well established. Each dictionary has its own criteria for classifying a word as archaic, obsolete, or classic, and we can learn from them. The findings suggest that checking the corpus data set is the easiest way to categorize words according to their labels.
Citations: 0
Use of verb valency patterns in the Indonesian monolingual learner’s dictionary
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-29 DOI: 10.1558/lexi.24995
Dora Amalía
The verb is one of the most perplexing features for learners of BIPA (Bahasa Indonesia bagi Penutur Asing “Indonesian for foreign speakers”). Indonesian verbs are particularly rich in affixes that correspond to their numerous senses. The learners find it challenging to use verbs with the appropriate affixes in a sentence structure. To function effectively as a learning tool, dictionaries must provide morphological and grammatical information. Frame semantics theory is used in this research to determine a verb’s meaning based on its semantic context and frame. The verb is described by identifying the grammatical constructions in which it participates, and by characterizing all of the obligatory and optional types of companions. By doing this, we can obtain the verb valency pattern to add to a dictionary’s morphological and grammatical information. This study aims to create entry models of affixed verbs with the valency pattern. The six transitive verbs for discussion comprise a variety of affixes with various senses: mempersembahkan “to present/dedicate”; membersihkan “to clean”; mencintai “to love”; memperlancar “to expedite”; memperbaiki “to fix”; and memberlakukan “to apply”. These entries serve as models for the BIPA learner’s dictionary.
Citations: 1
Digitalizing a local language dictionary
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-29 DOI: 10.1558/lexi.25076
Winda Luthfita, Selly Rizki Yanita
The National Agency for Language Development and Cultivation (henceforth, Badan Bahasa) has published many dictionaries as a government agency under the Ministry of Education and Culture of Indonesia. More than 100 dictionaries have been published since 1977. Some dictionaries have been revised by adding new entries and senses. In alignment with technological developments, Badan Bahasa has started an integration project that aims to provide an online application for their language products. In 2015, it started Program Pengayaan Kosakata (Word proposal application program), and this was followed by the launch of the online version of the Kamus Besar Bahasa Indonesia (KBBI; Comprehensive dictionary of Indonesian), Tesaurus Tematis Bahasa Indonesia (Thematic thesaurus of Indonesian), and Ensiklopedia Sastra Indonesia (Encyclopedia of Indonesian literature) in 2016. In 2020, Badan Bahasa started the development of Aplikasi Pangkalan Data Kamus, also called Aplikasi Kompilasi Kamus (AKK; Dictionary compilation application). This online application accommodates at least three kinds of dictionary – a local language dictionary, specialized dictionary, and bilingual dictionary – published by Badan Bahasa. The process has continued by developing a digitalization project targeting the digitalization of print versions of specialized dictionaries, Indonesian-local language dictionaries, and local language-Indonesian dictionaries. This article aims to discuss some challenges regarding the digitalization of dictionaries arising from the print versions, the dictionary structure, and the dictionary interface, and puts forward some solutions to deal with the issues. The research method uses qualitative methods for observing dictionary files and examining microstructure issues throughout the whole process. The results of this study are expected to support the digitalization process and dictionary development in Indonesia.
Citations: 0
Use of lexical bundles in an online comprehensive dictionary of Indonesian (KBBI Daring)
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-29 DOI: 10.1558/lexi.25177
Adi Budiwiyanto
Research on lexical bundles in the last few decades has focused mostly on written registers, especially academic writing. In this study, I investigate the use of lexical bundles in a different genre – dictionaries. As a lexical bundle is formulaic language specific to a particular register, I hypothesize that particular lexical bundles are used in dictionaries. The research focus of this study is the extent to which lexical bundles are used in the online version of the Kamus Besar Bahasa Indonesia (KBBI Daring; Comprehensive dictionary of Indonesian online), especially in the lemma, definition, and example sections. In addition, the study examines the characteristics of lexical bundles in dictionaries. The approach used is corpus based. As reference bundles, a total of 517 lexical bundles were extracted from the IndonesianWeb Corpus (available from SketchEngine). The bundles were then analyzed for their use in KBBI Daring in terms of their frequency, structure, and function. The results showed that the use of lexical bundles in KBBI Daring was mostly found in the definition section. The bundles found were generally in the form of phrases rather than clauses. In terms of structure, the lexical bundles were dominated by incomplete structures. The bundles, either in the definition or example sections, were mostly in the form of yang-clause fragments, such as yang digunakan untuk “that is used for,” yang terdiri atas “that consists of,” yang terbuat dari “that is made of/from,” yang berasal dari “that comes from,” and yang berhubungan dengan “that relates to.” In terms of function, presenting content is the dominant function in the KBBI, while organizing text is the least prevalent function. In a nutshell, each section in the dictionary, especially in the KBBI, has its own character.
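As a rough illustration of the corpus-based step described in this abstract, the sketch below counts recurring word sequences (n-grams) and keeps those that reach a frequency cut-off. The function name `extract_bundles`, the cut-off values, and the three Indonesian definition strings are all invented for this example; the abstract does not state the thresholds or tooling actually used, and real bundle studies typically also apply normalized frequency and dispersion criteria.

```python
from collections import Counter


def extract_bundles(sentences, n=3, min_freq=2):
    """Return n-word sequences that occur at least min_freq times.

    Hypothetical helper: a plain frequency cut-off on whitespace tokens,
    not the procedure used in the study.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Slide a window of n tokens across the sentence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {bundle: c for bundle, c in counts.items() if c >= min_freq}


# Toy definition texts made up for illustration (not taken from the KBBI).
definitions = [
    "alat yang digunakan untuk memotong",
    "benda yang digunakan untuk menulis",
    "kain yang terbuat dari sutra",
]

bundles = extract_bundles(definitions, n=3, min_freq=2)
print(bundles)
```

With these toy inputs only the fragment "yang digunakan untuk" recurs, which mirrors the kind of yang-clause fragment the abstract reports for definition text.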
Citations: 0
Avoiding Recursion in the Representation of Subsenses and Subentries in Dictionaries
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-10 DOI: 10.1093/ijl/ecad012
M. Mechura
Recursion, and recursion-like design patterns, are used in the entry schemas of dictionaries to model subsenses and subentries. Recursion occurs when elements of a given type, such as sense, are allowed to contain elements of the same or similar type, such as sense or subsense. This article argues that recursion unnecessarily increases the computational complexity of entries, making dictionaries less easily processable by machines. The article will show how entry schemas can be simplified by re-engineering subsenses and subentries as relations (like in a relational database) such that we only have flat lists of senses and entries, while the is-subsense-of and is-subentry-of relations are encoded using pairs of unique identifiers. This design pattern losslessly records the same information as recursion (including – importantly – the listing order of items inside an entry) but decreases the complexity of the entry structure and makes dictionary entries more easily machine-processable.
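The flat design this abstract describes can be sketched as follows. This is only an illustrative reading: the class names `Sense` and `SubsenseRelation`, the field names, and the sample entry are assumptions made for the example, not the schema proposed in the article. Senses live in one flat, ordered list, and the is-subsense-of relation is a separate list of identifier pairs.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    id: str
    definition: str


@dataclass
class SubsenseRelation:
    child_id: str   # the subsense
    parent_id: str  # the sense it is a subsense of


# Hypothetical entry: a flat, ordered list of senses (no nesting).
senses = [
    Sense("s1", "a financial institution"),
    Sense("s2", "the building of such an institution"),
    Sense("s3", "the land alongside a river"),
]
# is-subsense-of encoded as pairs of unique identifiers.
relations = [SubsenseRelation(child_id="s2", parent_id="s1")]


def top_level(senses, relations):
    """Senses that are not a subsense of anything, in listing order."""
    child_ids = {r.child_id for r in relations}
    return [s for s in senses if s.id not in child_ids]


def children_of(parent_id, senses, relations):
    """Subsenses of one sense; order comes from the flat list."""
    child_ids = {r.child_id for r in relations if r.parent_id == parent_id}
    return [s for s in senses if s.id in child_ids]


def render(senses, relations):
    """Rebuild a one-level-deep numbered display without recursion."""
    lines = []
    for i, sense in enumerate(top_level(senses, relations), start=1):
        lines.append(f"{i}. {sense.definition}")
        for j, sub in enumerate(children_of(sense.id, senses, relations)):
            lines.append(f"   {i}{chr(ord('a') + j)}. {sub.definition}")
    return lines


print("\n".join(render(senses, relations)))
```

Because listing order is carried by the flat list itself, the nested presentation can be rebuilt with plain loops and look-ups; `render` here handles only one level of subsenses, which keeps the sketch short.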
Citations: 0
Not Only Meaning… Verbel: The Electronic Dictionary of Paradigms of Polish Verbal Multiword Expressions
IF 0.5 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-06-02 DOI: 10.1093/ijl/ecad008
Sebastian Przybyszewski, Iwona Kosek, Monika Czerepowicka
This article presents Verbel: The Electronic Dictionary of Paradigms of Polish Verbal Multiword Expressions (MWEs) and discusses theoretical problems connected with compiling such a dictionary for inflectionally complex languages such as Polish. The dictionary includes over 5,000 Polish verbal MWEs and explicitly presents their forms and constraints in inflection. It also provides grammatical, semantic, pragmatic and prescriptive commentaries. The first part of the article covers the theoretical and methodological basis used in the compilation of the dictionary. Generally, a verbal MWE is inflected according to the paradigm of the verb which is its main component. However, MWEs may have some specific inflectional constraints connected with other factors (e.g. semantic, pragmatic), which result in different paradigms for verbal MWEs and for the verbs that are their main components. In the second part, the conception and content of the dictionary are discussed. Finally, the natural language processing tools that underlie the work on the dictionary are described.
Citations: 0
Wheat or Chaff? A Compound Selection Model Based on Look-Up Data
Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS Pub Date: 2023-05-30 DOI: 10.1093/ijl/ecad013
Mikkel Ekeland Paulsen
Which compounds should be included in general-purpose dictionaries is often an open question, answered by case-by-case consideration of all compounds above a certain corpus frequency threshold. Another way to determine which compounds should be listed is to examine which compounds, or rather which compound properties, are in demand by users. This study uses look-up data from the two officially sanctioned general-purpose dictionaries of Norwegian (Bokmålsordboka and Nynorskordboka) to derive an explicit compound selection model that performs with sensitivity and specificity comparable to those of the traditional procedure. These findings demonstrate that it is indeed possible to arrive at a fully operational and explicit compound selection model that meets the needs of users. With such a tool at their disposal, lexicographers would be able to separate the wheat from the chaff in the boundless field that is the compound lexicon of North Germanic languages.
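For readers unfamiliar with the two evaluation measures named in this abstract, the sketch below shows how sensitivity and specificity are computed for any yes/no selection procedure. The function name, the boolean lists, and the idea of comparing against a reference set of decisions are assumptions for illustration; the abstract does not describe the study's actual data or evaluation setup.

```python
def sensitivity_specificity(predicted, actual):
    """Compute (sensitivity, specificity) for paired boolean decisions.

    predicted[i]: the model says compound i should be included.
    actual[i]:    the reference decision for compound i (hypothetical here).
    Assumes both classes occur in `actual`, otherwise a division by zero
    would result.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # correctly included
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly excluded
    fp = sum(1 for p, a in pairs if p and not a)      # wrongly included
    fn = sum(1 for p, a in pairs if not p and a)      # wrongly excluded
    return tp / (tp + fn), tn / (tn + fp)


# Made-up decisions for five compounds.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]

sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity is the share of compounds that should be listed and are selected; specificity is the share that should be left out and are rejected. Two procedures are comparable in the abstract's sense when both figures are close.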
Citations: 0
Hausa Dictionary for Everyday Use: Hausa - English/ English - Hausa. Ƙamusun Hausa na yau da kullum: Hausa - Inglilishi/ Ingilishi - Hausa. Paul Newman and Roxana Ma Newman
IF 0.5 Zone 2 Literature 0 LANGUAGE & LINGUISTICS Pub Date: 2023-05-29 DOI: 10.1093/ijl/ecad010
C. Schmaling
Citations: 0
The Definition, Presentation and Automatic Generation of Contextual Data in Lexicography
IF 0.5 Zone 2 Literature 0 LANGUAGE & LINGUISTICS Pub Date: 2023-05-05 DOI: 10.1093/ijl/ecac020
M. J. Domínguez, R. Gouws
This paper deals with several aspects of context in lexicography. Section 1 briefly mentions some of the different approaches to the concept of context in various fields. Section 2 focuses on different uses and perceptions of the concept of context in lexicography, contrasting it with related concepts such as cotext, contextualization and contextual information. A more comprehensive discussion also covers different aspects of the occurrence of the concept of context in dictionary research, with specific reference to central aspects of the so-called inner and outer context. Various portals, dictionaries and dictionary entries illustrate these approaches. Section 3 approaches the subject from a user perspective. Section 4 addresses the question: how can contextual data be extracted or generated? To answer this question, some methods and tools for the (automatic) acquisition and analysis of contextual data, in particular of local contextual data in the sense of Faber and León-Araúz (2016), are introduced. Examples include lexical databases or semantic networks such as WordNet, corpus tools such as Sketch Engine, and predictive methods such as Word2vec and similar models. Some advantages and disadvantages of specific data acquisition tools used for the analysis of local contextual data are indicated. This section also contributes to a more detailed discussion of the automatic generation of the so-called local syntactic-semantic context or word environment, specifically of the building of syntactic-semantic argument patterns and their examples.
Citations: 1