ABSTRACT:Based on the entries for the word unicorn, this article investigates how meaning is defined in three very different dictionaries: Ælfric's Glossary (ÆGl), the Oxford English Dictionary (OED), and Urban Dictionary (UD). Starting with ÆGl in the Old English period, the article shows that different types of definitions as described by Lew (2013) are already present and that Ælfric's definitions of unicorn, in fact, combine divergent concepts of this mythological creature. The different meanings of unicorn presented by Ælfric are reflected in some of the multiple senses of the word as defined by the monumental OED. A comparison with the numerous entries for unicorn in UD reveals that one of its most prominent senses—'a very attractive (and hence unobtainable) person'—is missing from the OED and also from Lexico, an online dictionary of contemporary English provided by Oxford University Press. On the one hand, these similarities and differences reveal each dictionary's bias for a particular register. On a more fundamental level, however, the evidence calls into question how far classic dictionary definitions are actually able to convey word meaning. In a sense, the multiple overlapping and competing definitions of UD are more successful in representing the fuzziness of word meaning. In a similar way, ÆGl, though written by a single author, combines different sources on the unicorn without merging them into a unified account. Thus, from a typological perspective, medieval glossaries turn out to share certain features with crowd-sourced lexicographical resources like UD, and both are quite distinct from professional lexicography in how they approach word meaning.
A. Seiler. "How to Catch Your Unicorn: Defining Meaning in Ælfric's Glossary, the Oxford English Dictionary, and Urban Dictionary." Dictionaries 41(1): 245-276. Journal article, published 2020-12-17. https://doi.org/10.1353/dic.2020.0013
Don Chapman. "Describing Prescriptivism: Usage Guides and Usage Problems in British and American English by Ingrid Tieken-Boon van Ostade (review)." Dictionaries 41(1): 299-304. Journal article, published 2020-12-17. https://doi.org/10.1353/dic.2020.0019
ABSTRACT:Every year thousands of neologisms, or new words, are coined. Most neologisms are compounds or derivations. Existing words used with a new meaning (e.g., English smart [slim in Dutch] 'appearing to have a degree of intelligence' [OED], often used attributively before a machine or device) and new multiword units (urban gym) are also treated as neologisms. New loanwords are often considered neologisms as well: in Dutch many neologisms are borrowed from English, as with frosecco 'frozen prosecco' and the more familiar crowdsourcing and staycation.
Not every neologism is widely used, and the majority of new words will disappear. The more widely adopted or firmly rooted neologisms are often described in dictionaries, such as the Algemeen Nederlands Woordenboek (ANW), an online dictionary of present-day Dutch. Why are some new words adopted, while others are ignored? Is it necessary to register and describe neologisms that are likely to disappear, for example in a dictionary of neologisms? And what should a dictionary of neologisms look like?
In this article I present a pilot version of a new dictionary of Dutch neologisms. Firstly, I explain how neologisms are created in general and what Dutch neologisms look like. Secondly, I demonstrate why it is necessary to register and describe neologisms (including those that are not adopted in contemporary speech) in an online dictionary portal. Then I zoom in on Dutch and show how potential neologisms in Dutch can be detected with the aid of the computer tool Neoloog and through corpus analysis. Finally, I examine the lemma structure of a Dutch special-domain dictionary of neologisms, the Neologismenwoordenboek (NW), and discuss how it differs from the ANW in the way it describes neologisms.
Vivien Waszink. "Neologisms in an Online Portal: The Dutch Neologismenwoordenboek (NW)." Dictionaries 41(1): 27-44. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0003
ABSTRACT:The most recent literature on neology has discussed the criteria that must be taken into account in order to include new words in dictionaries (Metcalf 2002, Ishikawa 2006, O'Donovan and O'Neill 2008, Cook 2010, Freixa 2016, among others). Although there are other factors that must be considered, such as morphologic features and semantic transparency (Adelstein and Freixa 2013, Bernal et al. 2018), authors broadly agree that frequency plays a central role, given that high corpus frequency may be taken as evidence of the institutionalization of a lexical unit. However, it has also been pointed out that frequency is a complex criterion in itself, and, therefore, aspects such as stabilization in use (Cook 2010) or a possible longitudinal change in frequency (Metcalf 2002, Ishikawa 2006) must also be taken into account when measuring frequency in corpora.
In this paper, we approach lexical frequency as a criterion to evaluate whether neologisms should be included in Spanish dictionaries from a new perspective. Specifically, we compare data concerning change in frequency of neologisms through time with speakers' perceptions about their novelty, known as "neological feeling" in the specialized literature (Gardin et al. 1974, Sablayrolles 2003). Data about speakers' perceptions is obtained from online questionnaires carried out within the framework of the Neómetro project (Bernal et al. in press). A set of questionnaires was launched in which 100 subjects evaluated their perceptions of 130 neologisms in Spanish according to four different criteria (correct formation, frequency, novelty and necessity of inclusion in dictionaries). Frequency data is taken from an extensive corpus of texts from the press, Factiva, which provides histograms of frequency through time.
For this study, we analyze the neologisms that were perceived as the most and the least frequent in the questionnaires. We analyze their frequency curve through time in Factiva to find correlations between stabilization in time and speakers' perceptions of their institutionalization. The data allows us to improve the predictive capacity of frequency as a measure to decide which neologisms should be included in dictionaries, as it introduces factors (formal, semantic or usage) that favor or hinder institutionalization into the equation.
J. Freixa, Sergi Torner. "Beyond Frequency: On the Dictionarization of New Words in Spanish." Dictionaries 41(1): 131-153. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0008
ABSTRACT:This paper explores the inclusion of genericized trademarks that have made their way into Greek dictionaries. Genericized trademarks constitute a special type of neologism, balancing between non-lexical and lexical, between "proper" and "common". Although the goal of creating a brand name is to make a specific product easily recognizable by distinguishing it from the rest of its kind, the trademark might become so well-known and widely used that it starts denoting all similar products, becomes part of the general vocabulary and gains lemma status in dictionaries. Given the fact that very little, if any, documentation exists on the subject, be it publicized lexicographic policies, style guides, or any references in the relevant literature, the main aim of the article is to explore some of the criteria by which such items have made their way into dictionaries of Modern Greek. First, an overview of genericized trademarks and brand names in Modern Greek dictionaries is presented. Then, based on the etymological information in the dictionaries, the paper investigates how many genericized trademarks are borrowed by other languages compared to Greek and which languages these are. The list of all these items is cross-checked against two different corpora to compare the frequency of their lexical use to that of their non-lexical use. Finally, the article attempts to test whether the main criteria used in the English lexicographic tradition to differentiate the two forms of use also apply in the case of Modern Greek.
A. Vacalopoulou. "Exploring Criteria for the Inclusion of Trademarks in General Language Dictionaries of Modern Greek." Dictionaries 41(1): 155-177. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0009
ABSTRACT:This is an introduction to a special issue of Dictionaries: Journal of the Dictionary Society of North America. It offers a characterization of neology and describes the Globalex-sponsored workshop at which the papers in the issue originated. It provides an overview of the papers, which treat lexicographical neology and neological lexicography in Danish, Dutch, Estonian, Frisian, Greek, Korean, Spanish, and Swahili and address relevant aspects of lexicography in those languages, presenting state-of-the-art research into neology and ideas about modern lexicographic treatment of neologisms in various dictionary types.
Annette Klosa-Kückelhaus, Ilan Kernerman. "Global Viewpoints on Lexicography and Neologisms: An Introduction." Dictionaries 41(1): 1-9. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0001
ABSTRACT:This paper reports on the Korean Neologism Investigation Project and discusses a number of issues related to neologism research. Since 1994, when the government of South Korea initiated the project, the use of the Internet and mobile phones has increased exponentially and the methods and scope of the investigation into Korean neologisms have been modified accordingly. This project consists of collecting all the neologisms that appear each year in news articles on the Naver portal using a Web-based neologism extractor (task 1) and examining the usage development of neologisms within the past decade using a Web crawler (task 2). As a result of task 2, the neologisms that occurred at least twenty times in the Web-crawled corpus, across ten articles or more, for five years or more over a span of ten years, are considered as headword candidates. Whether these constitute suitable criteria for lexicographic inclusion is also examined. This paper also examines how the results of tasks 1 and 2 are reflected in Korean lexicography by looking up high-frequency neologisms in four major Korean dictionaries, among which two are user-generated. The results of this survey confirm the crucial role of expert lexicographers and the value of the Korean Neologism Investigation Project in the lexicographic inclusion of neologisms.
Kilim Nam, Soojin Lee, HaeRee Jung. "The Korean Neologism Investigation Project: Current Status and Key Issues." Dictionaries 41(1): 105-129. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0007
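The headword-candidate thresholds reported for task 2 of the Korean Neologism Investigation Project (at least twenty occurrences, ten or more articles, five or more years within a ten-year span) can be illustrated with a small filter. This is only a sketch under stated assumptions: the `Occurrence` record, the function name, and the reading of "years" as distinct calendar years are hypothetical and are not taken from the project's actual tooling. The input is assumed to be already restricted to the ten-year window.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Occurrence(NamedTuple):
    """One attested use of a neologism (hypothetical record layout)."""
    word: str
    article_id: str
    year: int


def headword_candidates(
    occurrences: Iterable[Occurrence],
    min_tokens: int = 20,    # "at least twenty times"
    min_articles: int = 10,  # "across ten articles or more"
    min_years: int = 5,      # "for five years or more"
) -> List[str]:
    """Return the words that meet all three thresholds, sorted alphabetically.

    Assumes every occurrence already falls inside the ten-year span.
    """
    tokens = defaultdict(int)
    articles = defaultdict(set)
    years = defaultdict(set)
    for occ in occurrences:
        tokens[occ.word] += 1
        articles[occ.word].add(occ.article_id)
        years[occ.word].add(occ.year)
    return sorted(
        word
        for word in tokens
        if tokens[word] >= min_tokens
        and len(articles[word]) >= min_articles
        and len(years[word]) >= min_years
    )
```

Keeping the three thresholds as parameters makes it easy to test whether looser or stricter criteria would change the candidate list, which is the kind of question the abstract raises about the suitability of these criteria.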
Margit Langemets, Jelena Kallas, Kaisa Norak, Indrek Hein
ABSTRACT:The Web era has intensified the need for the automatic monitoring of language, including the extraction of new words and senses. In this paper, we first give a brief overview of the unified dictionary system Ekilex, the starting point for all new lexicographic tasks at the Institute of the Estonian Language since 2019. We describe the existing databases meant for manual collecting and registering new words and meanings. Next we describe an experimental study on semi-automatic new word detection on the basis of the small media corpus and existing dictionaries carried out in 2018 at the Institute of the Estonian Language. The goal of the experiment was to develop a workflow for new word detection, to test the reliability of the tools for Estonian language processing, and to compile the new word candidate list. The experiment was focused on single word detection. The results revealed that in order to make new word discovery more effective we need more advanced tools for automatic language processing, and we perceive an urgent need to set up an infrastructure for (semi-) automatic new word detection.
This is the first study for Estonian aimed at the development of a tool to supply lexicographers with new word candidates for inclusion in a dictionary. We end the paper by discussing some aspects of the lexicographic treatment of new words and meanings in the near future.
Margit Langemets, Jelena Kallas, Kaisa Norak, Indrek Hein. "New Estonian Words and Senses: Detection and Description." Dictionaries 41(1): 69-82. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0005
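The general idea of detecting single-word candidates by comparing a corpus against existing dictionaries can be sketched as follows. This is not the workflow used at the Institute of the Estonian Language: the function name, the `min_count` cut-off, and the naive regular-expression tokenizer are all assumptions made for illustration, and a real pipeline for Estonian would need lemmatization before the dictionary lookup.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def new_word_candidates(
    corpus_texts: Iterable[str],
    known_headwords: Iterable[str],
    min_count: int = 2,  # hypothetical cut-off to suppress one-off typos
) -> List[Tuple[str, int]]:
    """List (word, count) pairs for corpus words missing from the dictionaries.

    Tokenization is a naive stand-in: runs of Unicode letters, lowercased.
    """
    known = {word.lower() for word in known_headwords}
    counts: Counter = Counter()
    for text in corpus_texts:
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if token not in known:
                counts[token] += 1
    # most_common() orders candidates by descending frequency.
    return [(word, n) for word, n in counts.most_common() if n >= min_count]
```

The output is only a candidate list; as the abstract stresses, the decision on inclusion remains with the lexicographer.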
The Danish Dictionary, a corpus-based online dictionary, contains just over 100,000 entries. The dictionary is updated on a regular basis, with new versions published two or three times a year. Whenever an update is released, it almost always becomes the object of public attention. The media love new words and usually assume that a new word in the dictionary is also a new word in the language—a neologism. Of course, popular belief is far from the truth: many newly published words have been in the language for a long time but were perhaps too infrequent to be included previously.
Given their popularity, neologisms are obviously interesting for the dictionary staff, and in this paper I analyze those that have been included recently and consider whether special selection criteria should apply. The editors do not use a specific method to detect neologisms in particular but have various tools to assist them in finding lemma candidates in general, and they can also analyze the updates that have been published in recent years. I pursue both these approaches, addressing questions including the following:
(1) What broad types of neologisms exist and what are their characteristics?
(2) How does pressure from English affect the vocabulary of the dictionary?
(3) Are Anglicisms dominant or used increasingly over time as compared with language-internal neologisms? Does globalization promote the import of words from other languages, too?
Although the notion 'neologism' pertains to a range of linguistic phenomena, I confine myself in this context to words and multiword expressions as (potential) entries.
Lars Trap-Jensen. "Language-Internal Neologisms and Anglicisms: Dealing with New Words and Expressions in The Danish Dictionary." Dictionaries 41(1): 11-25. Journal article, published 2020-05-15. https://doi.org/10.1353/dic.2020.0002