
Latest publications in Language Resources and Evaluation

Towards a resource for multilingual lexicons: an MT assisted and human-in-the-loop multilingual parallel corpus with multi-word expression annotation.
IF 1.8, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2026-01-01; Epub Date: 2026-03-14. DOI: 10.1007/s10579-025-09876-7
Lifeng Han, Najet Hadj Mohamed, Malak Rassem, Gareth J F Jones, Alan F Smeaton, Goran Nenadic

In this work, we introduce the construction of a machine translation (MT) assisted, human-in-the-loop multilingual parallel corpus with annotations of multi-word expressions (MWEs), named AlphaMWE. The MWEs include verbal MWEs (vMWEs) as defined in the PARSEME shared task, i.e., expressions headed by a verb. The annotated vMWEs are also manually aligned bilingually and multilingually. The languages covered include Arabic, Chinese, English, German, Italian, and Polish; the Arabic corpus includes both standard Arabic and dialectal variants from Egypt and Tunisia. Our original English corpus is taken from the 2018 PARSEME shared task. We performed machine translation of this source corpus, followed by human post-editing and annotation of target MWEs. Strict quality control was applied to limit errors: each MT output sentence received a first round of manual post-editing and annotation, followed by a second round of manual quality checking. One of our findings during corpus preparation is that accurate translation of MWEs presents challenges to MT systems, as reflected by the outcomes of the human-in-the-loop metric HOPE. To facilitate further MT research, we present a categorisation of the error types MT systems make when translating MWEs. To acquire a broader view of MT issues, we selected four popular state-of-the-art MT systems for comparison, namely Microsoft Bing Translator, GoogleMT, Baidu Fanyi, and DeepL MT. Because of the noise removal, translation post-editing, and MWE annotation by human professionals, we believe the AlphaMWE dataset will be an asset for both monolingual and cross-lingual research, such as multi-word term lexicography, MT, and information extraction.
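The data structure behind such a corpus can be pictured with a small sketch. The Python classes below are purely illustrative assumptions about how a bilingually aligned vMWE entry might be represented; the field names, category label, and example sentence pair are invented for demonstration and do not reflect the actual AlphaMWE release format.

```python
from dataclasses import dataclass, field

# Illustrative container for one aligned sentence pair with vMWE annotation.
# Field names are assumptions for demonstration, not the AlphaMWE release schema.
@dataclass
class AlignedVMWE:
    source_tokens: list[int]   # token indices of the vMWE in the English sentence
    target_tokens: list[int]   # token indices of the aligned vMWE in the target sentence
    category: str              # PARSEME category label, e.g. "VID" or "LVC.full"

@dataclass
class SentencePair:
    source: str                # English source sentence
    target: str                # post-edited target sentence
    target_lang: str           # e.g. "zh", "de", "pl"
    vmwes: list[AlignedVMWE] = field(default_factory=list)

pair = SentencePair(
    source="He kicked the bucket last year.",
    target="Er hat letztes Jahr den Löffel abgegeben.",
    target_lang="de",
    vmwes=[AlignedVMWE(source_tokens=[1, 2, 3], target_tokens=[4, 5, 6], category="VID")],
)
print(pair.vmwes[0].category)  # -> VID
```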

Citations: 0
OjibweMorph: an approachable finite-state transducer for Ojibwe (and beyond).
IF 1.8, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2026-01-01; Epub Date: 2026-02-28. DOI: 10.1007/s10579-025-09887-4
Christopher Hammerly, Nora Livesay, Antti Arppe, Anna Stacey, Miikka Silfverberg

This paper describes the design, evaluation, and application of OjibweMorph, a finite-state transducer (FST) for generating and analyzing words in the Central Algonquian language Ojibwe. We created a language-general modular system for creating FSTs from human- and machine-readable spreadsheets, where sets of inflectional and derivational morphology can be defined, combined with a lexical database, and automatically compiled into an FST. We show how this system is applied to generate and analyze the complex nominal and verbal morphology in Ojibwe, with an eye towards how our framework and toolkit can be used to create FSTs for other morphologically complex languages. We evaluate the Ojibwe version of the system by checking the model's performance against a set of inflectional forms and example sentences from the Ojibwe People's Dictionary, and describe the application of the FST to create a linguistically analyzed corpus, an automatic verb conjugation tool for education, a spell-checker, and intelligent dictionary search.
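To make the "spreadsheet to transducer" idea concrete, here is a deliberately toy sketch: a tabular specification mapping morphological tag strings to affixes drives a generator function. This is not the OjibweMorph toolkit, its rule format, or a real FST compiler; the stem, tags, and affixes are illustrative assumptions only.

```python
import csv
from io import StringIO

# Toy tabular morphology spec: each row pairs a tag string with a prefix/suffix.
# NOT the OjibweMorph rule format; it only sketches how spreadsheet rows can
# drive word-form generation before compilation into a transducer.
SPREADSHEET = """tags,prefix,suffix
VAI+Ind+Pos+Neu+1Sg,ni,
VAI+Ind+Pos+Neu+2Sg,gi,
VAI+Ind+Pos+Neu+3Sg,,
VAI+Ind+Pos+Neu+1Pl,ni,min
"""

def load_rules(text):
    return {row["tags"]: (row["prefix"], row["suffix"])
            for row in csv.DictReader(StringIO(text))}

def generate(stem, tags, rules):
    prefix, suffix = rules[tags]
    return f"{prefix}{stem}{suffix}"

rules = load_rules(SPREADSHEET)
# "nibaa" ("sleep") is used only as a familiar illustrative stem; real Ojibwe
# prefix allomorphy is more involved than this toy rule table.
print(generate("nibaa", "VAI+Ind+Pos+Neu+1Sg", rules))  # e.g. "ninibaa" under these toy rules
```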

Citations: 0
The narratives of war (NoW) corpus of written testimonies of the Russia-Ukraine war.
IF 1.8, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2025-01-01; Epub Date: 2025-02-19. DOI: 10.1007/s10579-025-09813-8
Serhii Zasiekin, Larysa Zasiekina, Emilie Altman, Mariia Hryntus, Victor Kuperman

Documentation and analysis of the psychological states experienced by witnesses and survivors of catastrophic events is a critical concern of psychological research. This paper introduces a new corpus of written testimonies collected from nearly 1500 Ukrainian civilians between May 2022 and January 2024, during Russia's invasion of Ukraine. The texts are available in the original Ukrainian and in English translation. The Narratives of War (NoW) corpus additionally contains demographic and geographic data on respondents, as well as their scores in tests of PTSD symptoms and moral injury. The paper provides a detailed introduction to the method of data collection and the corpus structure. It also reports a quantitative frequency-based "keyness" analysis that identifies words particularly representative of the NoW corpus, as compared to a reference corpus of Ukrainian texts predating the war with Russia. These keywords shed light on the psychological state of witnesses of war. With its materials collected during the ongoing war, the corpus contributes to the body of knowledge for studies of the psychological impact of war and trauma on civilian populations.
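A frequency-based keyness analysis of the kind mentioned above is commonly computed with Dunning's log-likelihood (G2) statistic, which measures how strongly a word's frequency in the target corpus departs from a reference corpus. The sketch below shows that standard calculation with hypothetical counts; the authors' exact implementation may differ.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Dunning-style G2 keyness for one word, comparing its observed frequencies
    in a target and a reference corpus against their expected frequencies.
    A standard corpus-linguistics statistic; not necessarily the paper's code."""
    total = size_target + size_ref
    expected_t = size_target * (freq_target + freq_ref) / total
    expected_r = size_ref * (freq_target + freq_ref) / total
    g2 = 0.0
    if freq_target > 0:
        g2 += freq_target * math.log(freq_target / expected_t)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_r)
    return 2 * g2

# Hypothetical counts: a word appearing 120 times in a 1M-token target corpus
# versus 40 times in a 2M-token reference corpus.
print(round(log_likelihood(120, 1_000_000, 40, 2_000_000), 2))
```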

Citations: 0
VeLeSpa: An inflected verbal lexicon of Peninsular Spanish and a quantitative analysis of paradigmatic predictability.
IF 1.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2025-01-01; Epub Date: 2024-10-09. DOI: 10.1007/s10579-024-09776-2
Borja Herce

This paper presents VeLeSpa, a verbal lexicon of Peninsular Spanish, which contains the full paradigms (all 63 cells), in phonological form, of 6553 verbs, along with their corresponding frequencies. The process and decisions involved in building the resource are presented. In addition, based on the 3000+ most frequent verbs, a quantitative analysis of morphological predictability in Spanish verbal inflection is conducted. The results and their drivers are discussed, as well as observed differences from other Romance languages and Latin.
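Paradigmatic predictability is often quantified as the conditional entropy of one paradigm cell given another: the lower the entropy, the more predictable the cell. The sketch below computes that measure over hypothetical inflection patterns; it illustrates the general notion only and is not necessarily the exact metric used in the paper.

```python
from collections import Counter
import math

def conditional_entropy(pairs):
    """H(cell2 | cell1) over (pattern_in_cell1, pattern_in_cell2) pairs.
    A common way to quantify paradigmatic predictability; shown here only as an
    illustrative measure, not the paper's exact formulation."""
    joint = Counter(pairs)
    marginal = Counter(a for a, _ in pairs)
    n = sum(joint.values())
    h = 0.0
    for (a, b), count in joint.items():
        p_joint = count / n
        p_cond = count / marginal[a]
        h -= p_joint * math.log2(p_cond)
    return h

# Hypothetical inflection-class patterns: 1SG present ending -> 1SG preterite ending.
pairs = [("-o", "-é"), ("-o", "-é"), ("-o", "-í"), ("-oy", "-uve")]
print(round(conditional_entropy(pairs), 3))  # ~0.689 bits for these toy data
```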

Citations: 0
Sentiment analysis dataset in Moroccan dialect: bridging the gap between Arabic and Latin scripted dialect
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-09-11. DOI: 10.1007/s10579-024-09764-6
Mouad Jbel, Mourad Jabrane, Imad Hafidi, Abdulmutallib Metrane

Sentiment analysis, the automated process of determining emotions or opinions expressed in text, has seen extensive exploration in the field of natural language processing. However, one aspect that has remained underrepresented is the sentiment analysis of the Moroccan dialect, which boasts a unique linguistic landscape and the coexistence of multiple scripts. Previous works in sentiment analysis primarily targeted dialects employing Arabic script. While these efforts provided valuable insights, they may not fully capture the complexity of Moroccan web content, which features a blend of Arabic and Latin script. As a result, our study emphasizes the importance of extending sentiment analysis to encompass the entire spectrum of Moroccan linguistic diversity. Central to our research is the creation of the largest public dataset for Moroccan dialect sentiment analysis, incorporating Moroccan dialect written not only in Arabic script but also in Latin characters. By assembling a diverse range of textual data, we constructed a dataset of 19,991 manually labeled texts in Moroccan dialect, together with publicly available lists of Moroccan-dialect stop words, as a new contribution to Moroccan Arabic resources. In our exploration of sentiment analysis, we undertook a comprehensive study encompassing various machine-learning models to assess their compatibility with our dataset. While our investigation revealed that the highest accuracy of 98.42% was attained with the DarijaBert-mix transfer-learning model, we also examined deep learning models; notably, a CNN model achieved a commendable accuracy of 92%. Furthermore, to affirm the reliability of our dataset, we tested the CNN model on smaller publicly available datasets of Moroccan dialect, with results that proved promising and supportive of our findings.
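As a rough illustration of the CNN baseline mentioned in the abstract, the sketch below builds a small 1D-convolutional text classifier in Keras. The vocabulary size, sequence length, and all hyperparameters are assumptions for demonstration, not the authors' configuration, and the training call is only indicated in a comment.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal 1D-CNN text classifier in the spirit of the CNN baseline described
# above. All sizes and hyperparameters are illustrative assumptions.
VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 30_000, 64, 2

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),          # token-id -> dense vector
    layers.Conv1D(128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.build(input_shape=(None, MAX_LEN))
model.summary()
# model.fit(train_ids, train_labels, validation_split=0.1, epochs=5) would train
# it on integer-encoded Moroccan-dialect texts (train_ids/train_labels assumed).
```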

Citations: 0
Studying word meaning evolution through incremental semantic shift detection
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-09-09. DOI: 10.1007/s10579-024-09769-1
Francesco Periti, Sergio Picascia, Stefano Montanelli, Alfio Ferrara, Nina Tahmasebi

The study of semantic shift, that is, of how words change meaning as a consequence of social practices, events and political circumstances, is relevant in Natural Language Processing, Linguistics, and Social Sciences. The increasing availability of large diachronic corpora and advances in computational semantics have accelerated the development of computational approaches to detecting such shift. In this paper, we introduce a novel approach to tracing the evolution of word meaning over time. Our analysis focuses on gradual changes in word semantics and relies on an incremental approach to semantic shift detection (SSD) called What is Done is Done (WiDiD). WiDiD leverages scalable and evolutionary clustering of contextualised word embeddings to detect semantic shift and capture temporal transactions in word meanings. Existing approaches to SSD (a) significantly simplify the semantic shift problem to cover change between two (or a few) time points, and (b) consider the existing corpora as static. We instead treat SSD as an organic process in which word meanings evolve across tens or even hundreds of time periods as the corpus is progressively made available. This results in an extremely demanding task that entails a multitude of intricate decisions. We demonstrate the applicability of this incremental approach on a diachronic corpus of Italian parliamentary speeches spanning eighteen distinct time periods. We also evaluate its performance on seven popular labelled benchmarks for SSD across multiple languages. Empirical results show that our results are comparable to state-of-the-art approaches, while outperforming the state-of-the-art for certain languages.
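The incremental idea can be sketched as follows: embed a word's usages period by period, cluster each new period, and link the new clusters to the previous period's clusters by similarity. The code below is a schematic stand-in using agglomerative clustering and cosine similarity on random vectors; WiDiD's actual evolutionary clustering algorithm differs.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Schematic incremental sense clustering: each period's usage embeddings are
# clustered, and new clusters are linked to the previous period's centroids by
# cosine similarity. Illustrates the general idea only, not WiDiD itself.
def cluster_period(embeddings, n_clusters=2):
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(embeddings)
    return [embeddings[labels == k].mean(axis=0) for k in range(n_clusters)]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def link(prev_centroids, new_centroids):
    # For each new cluster, record the most similar cluster from the previous period.
    return [(i, max(range(len(prev_centroids)),
                    key=lambda j: cosine(c_new, prev_centroids[j])))
            for i, c_new in enumerate(new_centroids)]

rng = np.random.default_rng(0)
period1 = rng.normal(size=(40, 8))   # stand-in for contextual embeddings, period 1
period2 = rng.normal(size=(40, 8))   # stand-in for contextual embeddings, period 2
c1 = cluster_period(period1)
c2 = cluster_period(period2)
print(link(c1, c2))  # (new cluster, linked previous cluster) pairs
```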

Citations: 0
PARSEME-AR: Arabic reference corpus for multiword expressions using PARSEME annotation guidelines
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-08-28. DOI: 10.1007/s10579-024-09763-7
Najet Hadj Mohamed, Cherifa Ben Khelil, Agata Savary, Iskander Keskes, Jean Yves Antoine, Lamia Belguith Hadrich

In this paper we present PARSEME-AR, the first openly available Arabic corpus manually annotated for Verbal Multiword Expressions (VMWEs). The annotation process is carried out based on guidelines put forward by PARSEME, a multilingual project covering more than 26 languages. The corpus contains 4749 VMWEs in about 7500 sentences taken from the Prague Arabic Dependency Treebank. The results notably show a high degree of discontinuity in Arabic VMWEs in comparison to other languages in the PARSEME suite. We also present analyses of interesting and challenging phenomena encountered during the annotation process. Moreover, we offer the first benchmark for the VMWE identification task in Arabic by training two state-of-the-art systems on our Arabic data.
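PARSEME corpora are distributed in the CUPT format (CoNLL-U Plus with a final PARSEME:MWE column), where tokens belonging to a VMWE carry labels such as 1:VID on the first token and a bare 1 on continuation tokens. The reader below is a minimal, unofficial sketch of how such annotations can be collected per sentence; comment handling is kept simple, and multiword-token ranges and error handling are omitted.

```python
# Minimal, unofficial reader for .cupt-style files: collects, for each sentence,
# its token forms and the surface forms belonging to each numbered VMWE.
def read_vmwes(path):
    sentences = []
    tokens, vmwes = [], {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line.startswith("#"):
                continue                      # metadata lines
            if not line:                      # blank line ends a sentence
                if tokens:
                    sentences.append((tokens, vmwes))
                    tokens, vmwes = [], {}
                continue
            cols = line.split("\t")
            form, mwe_col = cols[1], cols[-1]  # FORM column and PARSEME:MWE column
            tokens.append(form)
            if mwe_col not in ("*", "_"):
                for part in mwe_col.split(";"):  # e.g. "1:VID" or "1"
                    mwe_id = part.split(":")[0]
                    vmwes.setdefault(mwe_id, []).append(form)
    if tokens:
        sentences.append((tokens, vmwes))
    return sentences
```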

Citations: 0
Normalized dataset for Sanskrit word segmentation and morphological parsing
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-08-28. DOI: 10.1007/s10579-024-09724-0
Sriram Krishnan, Amba Kulkarni, Gérard Huet

Sanskrit processing has seen a surge in the use of data-driven approaches over the past decade. Various tasks such as segmentation, morphological parsing, and dependency analysis have been tackled through the development of state-of-the-art models, despite working with relatively limited datasets compared to other languages. However, a significant challenge lies in the availability of annotated datasets that are lexically, morphologically, syntactically, and semantically tagged. While syntactic and semantic tags are preferable for later stages of processing such as sentential parsing and disambiguation, lexical and morphological tags are crucial for the low-level tasks of word segmentation and morphological parsing. The Digital Corpus of Sanskrit (DCS) is one notable effort that hosts over 650,000 lexically and morphologically tagged sentences from around 250 texts, but it also has limitations at different levels of sentence analysis, such as the chunk, segment, stem, and morphological levels. To overcome these limitations, we look at alternatives such as the Sanskrit Heritage Segmenter (SH) and the Saṃsādhanī tools, which provide information complementing the DCS data. This work focuses on enriching the DCS dataset by incorporating analyses from SH, thereby creating a dataset that is rich in lexical and morphological information. Furthermore, this work also discusses the impact of such datasets on the performance of existing segmenters, specifically the Sanskrit Heritage Segmenter.
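One small piece of such an enrichment pipeline is deciding whether any segmentation candidate produced by the Heritage tools matches the DCS gold segmentation for a chunk. The function below sketches that check; the normalisation step and the example segments are hypothetical and do not reflect the tools' actual output formats.

```python
# Illustrative check for enriching DCS entries with Heritage analyses: does any
# candidate segmentation proposed by the Sanskrit Heritage Segmenter match the
# DCS gold segmentation after a simple normalisation? Field layout and example
# data are hypothetical, not the real DCS or SH output formats.
def normalise(segments):
    return tuple(s.strip().lower() for s in segments)

def matching_candidate(dcs_gold, sh_candidates):
    gold = normalise(dcs_gold)
    for candidate in sh_candidates:
        if normalise(candidate) == gold:
            return candidate
    return None

dcs_gold = ["rāmaḥ", "gacchati"]                      # hypothetical gold segments
sh_candidates = [["rāma", "āgacchati"], ["rāmaḥ", "gacchati"]]  # hypothetical candidates
print(matching_candidate(dcs_gold, sh_candidates))    # -> ['rāmaḥ', 'gacchati']
```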

Citations: 0
Conversion of the Spanish WordNet databases into a Prolog-readable format
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-08-21. DOI: 10.1007/s10579-024-09752-w
Pascual Julián-Iranzo, Germán Rigau, Fernando Sáenz-Pérez, Pablo Velasco-Crespo

WordNet is a lexical database for English that is supplied in a variety of formats, including one compatible with the Prolog programming language. Given the success and usefulness of WordNet, wordnets for other languages have been developed, including Spanish. The Spanish WordNet, like others, does not provide a version compatible with Prolog. This work aims to fill this gap by translating the Multilingual Central Repository (MCR) version of the Spanish WordNet into a Prolog-compatible format. Thanks to this translation, a set of Spanish lexical databases is obtained, which allows access to WordNet information using declarative techniques and the deductive capabilities of the Prolog language. This work also facilitates the development of other programs to analyze the obtained information. Notably, we have adapted the technique of differential testing, used in software testing, to verify the correctness of this conversion. In addition, to ensure the consistency of the generated Prolog databases, as well as the databases from which we started, a complete series of integrity-constraint tests has been carried out. In this way we have discovered some inconsistency problems in the MCR databases that are reflected in the generated Prolog databases; these have been reported to the owners of those databases.
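The conversion and its differential test can be pictured with a small sketch: lexical rows are serialised as Prolog facts, and the test re-parses the emitted facts and compares them with the source rows. The predicate name, column layout, and example synset identifiers below are illustrative assumptions, not the actual MCR schema or the published Prolog-WordNet format.

```python
import re

# Hedged sketch of the conversion idea: emit relational rows as Prolog facts,
# then differentially test by re-parsing the facts and comparing with the input.
# Predicate and column names are illustrative, not the real MCR/Prolog schema.
def quote(atom: str) -> str:
    return "'" + atom.replace("\\", "\\\\").replace("'", "\\'") + "'"

def to_fact(row):
    synset_id, word, sense = row
    return f"wn_word({quote(synset_id)},{quote(word)},{sense})."

rows = [("spa-30-02084071-n", "perro", 1),   # hypothetical synset id / variant rows
        ("spa-30-02084071-n", "can", 2)]
facts = [to_fact(r) for r in rows]
print("\n".join(facts))

# Differential check: parse the generated facts back and compare with the source rows.
pattern = re.compile(r"wn_word\('(.+?)','(.+?)',(\d+)\)\.")
parsed = [(m.group(1), m.group(2), int(m.group(3))) for m in map(pattern.match, facts)]
assert parsed == rows, "conversion and re-parse disagree"
```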

Citations: 0
Annotation and evaluation of a dialectal Arabic sentiment corpus against benchmark datasets using transformers
IF 2.7, CAS Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS. Pub Date: 2024-08-18. DOI: 10.1007/s10579-024-09750-y
Ibtissam Touahri, Azzeddine Mazroui

Sentiment analysis is a task in natural language processing aiming to identify the overall polarity of reviews for subsequent analysis. This study used the Arabic Speech-Act and Sentiment analysis dataset, the Arabic Sentiment Tweets Dataset, and the SemEval benchmark datasets, along with the Moroccan sentiment analysis corpus, which focuses on the Moroccan dialect. Furthermore, a modern standard and dialectal Arabic corpus has been created and annotated according to three language types: Modern Standard Arabic, Moroccan Arabic Dialect, and Mixed Language. The annotation has been performed at the sentiment level, categorizing sentiments as positive, negative, or mixed. The sizes of the datasets range from 2000 to 21,000 reviews. The essential dialectal characteristics needed to enhance a sentiment classification system have been outlined. The proposed approach deploys several supervised models, including occurrence vectors, Recurrent Neural Network-Long Short-Term Memory, and the pre-trained transformer model Arabic Bidirectional Encoder Representations from Transformers (AraBERT), complemented by the integration of Generative Adversarial Networks (GANs). The uniqueness of the proposed approach lies in manually constructing and annotating a dialectal sentiment corpus and carefully studying its main characteristics, which are then used to feed the classical supervised models. Moreover, GANs that widen the gap between the studied classes have been used to enhance the results obtained with AraBERT. The classification test results have been promising, enabling a comparison with other systems. The proposed system has been evaluated against the state-of-the-art systems Mazajak and CAMeL Tools, designed for most Arabic dialects, using the mentioned datasets. A significant improvement of 30 points in F^NN has been observed. These results affirm the versatility of the proposed system, demonstrating its effectiveness across multi-dialectal and multi-domain datasets, both balanced and unbalanced.
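A minimal fine-tuning recipe for an AraBERT-style classifier with the three labels used above (positive, negative, mixed) might look like the sketch below, using the Hugging Face transformers API. The checkpoint identifier and hyperparameters are assumptions, the dataset objects are placeholders, and the GAN-based augmentation described in the abstract is not reproduced.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Sketch of fine-tuning an AraBERT-style checkpoint for 3-way sentiment
# classification. The model id and hyperparameters are assumptions, not the
# authors' setup; loading the checkpoint requires network access.
MODEL = "aubmindlab/bert-base-arabertv02"   # assumed AraBERT checkpoint id
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

def encode(batch):
    # Tokenize raw review texts into fixed-length input ids.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

# `train_ds` and `dev_ds` are assumed to be datasets.Dataset objects with
# "text" and "label" columns, e.g. built from the annotated reviews:
# train_ds = train_ds.map(encode, batched=True)
# dev_ds = dev_ds.map(encode, batched=True)

args = TrainingArguments(output_dir="arabert-sentiment",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         learning_rate=2e-5)
# trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=dev_ds)
# trainer.train()
```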

Citations: 0