EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020: Latest Papers

DIACR-Ita @ EVALITA2020: Overview of the EVALITA2020 Diachronic Lexical Semantics (DIACR-Ita) Task
Pub Date : 2020-12-09 DOI: 10.4000/BOOKS.AACCADEMIA.7613
Pierpaolo Basile, A. Caputo, Tommaso Caselli, Pierluigi Cassotti, Rossella Varvara
This paper describes the first edition of the “Diachronic Lexical Semantics” (DIACR-Ita) task at the EVALITA2020 campaign. The task challenges participants to develop systems that can automatically detect if a given word has changed its meaning over time, given contextual information from corpora. The task, at its first edition, attracted 9 participant teams and collected a total of 36 submission runs.
Citations: 32
QMUL-SDS @ DIACR-Ita: Evaluating Unsupervised Diachronic Lexical Semantics Classification in Italian (short paper)
Pub Date : 2020-11-05 DOI: 10.4000/BOOKS.AACCADEMIA.7638
Rabab Alkhalifa, A. Tsakalidis, A. Zubiaga, Maria Liakata
In this paper, we present the results and main findings of our system for the DIACR-ITA 2020 Task. Our system focuses on using variations of training sets and different semantic detection methods. The task involves training, aligning and predicting a word's vector change from two diachronic Italian corpora. We demonstrate that, in terms of accuracy, using Temporal Word Embeddings with a Compass C-BOW model is more effective than alternative approaches including Logistic Regression and a Feed-Forward Neural Network. Our model ranked 3rd with an accuracy of 83.3%.
Citations: 1
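The QMUL-SDS abstract above speaks of predicting a word's vector change across two aligned corpora. A minimal sketch of one common way to do that follows, assuming the two vectors already live in a shared space; the cosine-distance measure, the `threshold` value, and the toy vectors are illustrative assumptions and are not taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def has_changed(vec_t1, vec_t2, threshold=0.5):
    """Label a word as changed (1) or stable (0).

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    return 1 if cosine_distance(vec_t1, vec_t2) > threshold else 0


# Toy two-dimensional vectors, invented for this example.
stable_label = has_changed([1.0, 0.0], [1.0, 0.0])
shifted_label = has_changed([1.0, 0.0], [0.0, 1.0])
```

In practice the cut-off would have to be tuned on held-out data, which is one reason a supervised classifier over the same distances is a natural alternative.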
DH-FBK @ HaSpeeDe2: Italian Hate Speech Detection via Self-Training and Oversampling
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6934
E. Leonardelli, S. Menini, Sara Tonelli
We describe in this paper the system submitted by the DH-FBK team to the HaSpeeDe evaluation task, which deals with Italian hate speech detection (Task A). While we adopt a standard approach for fine-tuning AlBERTo, the Italian BERT model trained on tweets, we propose to improve the final classification performance by two additional steps, i.e. self-training and oversampling. Specifically, we extend the initial training data with additional silver data, carefully sampled from domain-specific tweets and obtained after first training our system only with the task training data. Then, we retrain the classifier by merging silver and task training data but oversampling the latter, so that the obtained model is more robust to possible inconsistencies in the silver data. With this configuration, we obtain a macro-averaged F1 of 0.753 on tweets, and 0.702 on news headlines.
Citations: 5
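The two extra steps named in the DH-FBK abstract above, self-training and oversampling, can be sketched at the data-assembly level. This is only a rough illustration: the confidence filter, the `min_confidence` and `oversample_factor` values, and the `toy_predict` stand-in for the first-pass classifier are all assumptions and are not described in the abstract.

```python
def self_label(unlabeled, predict, min_confidence=0.9):
    """Turn unlabeled texts into silver data using a first-pass model.

    `predict` maps a text to (label, confidence). Filtering by confidence is
    an assumed sampling strategy, not the authors' documented procedure.
    """
    silver = []
    for text in unlabeled:
        label, confidence = predict(text)
        if confidence >= min_confidence:
            silver.append((text, label))
    return silver


def build_retraining_set(gold, silver, oversample_factor=2):
    """Merge gold and silver data, repeating the gold examples."""
    if oversample_factor < 1:
        raise ValueError("oversample_factor must be at least 1")
    return list(gold) * oversample_factor + list(silver)


def toy_predict(text):
    """Hypothetical stand-in for a classifier trained on the gold data."""
    return (1, 0.95) if text.endswith("!") else (0, 0.6)


gold = [("first gold text", 0), ("second gold text!", 1)]
silver = self_label(["new tweet!", "another tweet"], toy_predict)
training_set = build_retraining_set(gold, silver, oversample_factor=2)
```

Repeating the gold examples gives them more weight during retraining, which is the intuition behind making the model more robust to noisy silver labels.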
Venses @ HaSpeeDe2 & SardiStance: Multilevel Deep Linguistically Based Supervised Approach to Classification
Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.6962
R. Delmonte
In this paper we present the results obtained with ItVENSES, a system for syntactic and semantic processing that relies on the Italian parser ItGetaruns to analyse each sentence. In previous EVALITA tasks we used only semantics to produce the results. In this year's EVALITA, we used both a fully and a mixed statistically based approach, as well as the semantic one used previously. The statistical approaches are all characterized by the use of n-grams and the usual tf-idf indices. We added another parameter, the Kullback-Leibler divergence, to compute similarities. In addition, we used emoticons and hashtags. Results for the two runs allowed were fairly low, around 40% F1-score. We continued producing other runs on the basis of the statistical approach and, after receiving the gold test version and the evaluation script, we discovered that in one of these additional runs, the fourth, we improved to 54% macro F1 for the HaSpeeDe2 task and to 48% macro F1 for SardiStance.
Citations: 2
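The Venses abstract above mentions the Kullback-Leibler divergence as a similarity parameter. A small sketch of how that quantity can be computed over unigram distributions follows; the add-one smoothing, the shared vocabulary, and the toy token lists are assumptions made so that the example is well defined, and they are not taken from the paper.

```python
import math
from collections import Counter


def unigram_distribution(tokens, vocabulary, alpha=1.0):
    """Smoothed unigram probabilities over a fixed vocabulary.

    Additive smoothing (alpha > 0) keeps every probability positive, so the
    divergence below never divides by zero.
    """
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def kl_divergence(p, q):
    """D_KL(p || q) for two distributions over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


# Toy token lists, invented for illustration.
doc_a = ["il", "gatto", "dorme", "il", "gatto"]
doc_b = ["il", "cane", "corre"]
vocab = sorted(set(doc_a) | set(doc_b))
p = unigram_distribution(doc_a, vocab)
q = unigram_distribution(doc_b, vocab)
divergence = kl_divergence(p, q)
```

Note that the divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ; a lower value indicates more similar distributions.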
MDD @ AMI: Vanilla Classifiers for Misogyny Identification (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6819
Samer El Abassi, Sergiu Nisioi
In this report, we present a set of vanilla classifiers that we used to identify misogynous and aggressive texts in Italian social media. Our analysis shows that simple classifiers with little feature engineering have a strong tendency to overfit and yield a strong bias on the test set. Additionally, we investigate the usefulness of function words, pronouns, and shallow-syntactical features to observe whether misogynous or aggressive texts have specific stylistic elements.
Citations: 2
GUL.LE.VER @ GhigliottinAI: A Glove based Artificial Player to Solve the Language Game "La Ghigliottina" (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7500
Nazareno de Francesco
The paper describes GUL.LE.VER (GUiLlottine gLovE resolVER), a Glove-based system developed to solve the game “La Ghigliottina”, which participated in the Ghigliottin-AI task of Evalita 2020 (Basile et al., 2020). The system ranked second, with a Precision of 0.26 and an R@10 of 0.46: it solves more than one guillotine every four games, achieving results comparable to human players. The system also proved to solve a different kind of guillotine than the first-ranked system, ‘Il Mago della Ghigliottina’ (Sangati et al., 2018). An approach based on these two kinds of systems may give a boost to this field of research.
Citations: 2
KonKretiKa @ CONcreTEXT: Computing Concreteness Indexes with Sigmoid Transformation and Adjustment for Context
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7478
Yulia Badryzlova
The present paper is a technical report of KonKretiKa, a system for computation of concreteness indexes of words in context, submitted to the English track of the CONcreTEXT shared task. We treat concreteness as a bimodal problem and compute the concreteness indexes using paradigms of concrete and abstract seed words and distributional semantic similarity. We also conduct sigmoid transformation to achieve greater similarity to the psycholinguistically attested data, and apply dynamic adjustment of static indexes for sentential context. One of the modifications of the presented system ranked third in the task, with r_s = .6634 and r = .6685 against the gold standard.
Citations: 2
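The KonKretiKa abstract above combines similarity to concrete and abstract seed words with a sigmoid transformation. One possible reading is sketched below; the difference-of-means formula, the `steepness` parameter, and the similarity values are assumptions for illustration, since the abstract does not give the exact computation.

```python
import math


def raw_index(sims_to_concrete, sims_to_abstract):
    """Mean similarity to concrete seeds minus mean similarity to abstract seeds.

    This difference-of-means form is an assumed formulation, not the
    system's documented one.
    """
    mean_concrete = sum(sims_to_concrete) / len(sims_to_concrete)
    mean_abstract = sum(sims_to_abstract) / len(sims_to_abstract)
    return mean_concrete - mean_abstract


def sigmoid(x, steepness=1.0):
    """Logistic function mapping any real score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


# Invented similarity scores for a word that sits closer to the concrete seeds.
score = sigmoid(raw_index([0.8, 0.6], [0.2, 0.2]), steepness=5.0)
```

A steeper sigmoid pushes scores toward the two ends of the scale, which fits the abstract's view of concreteness as bimodal.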
Ghigliottin-AI@EVALITA2020: Evaluating Artificial Players for the Language Game "La Ghigliottina" (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7488
Pierpaolo Basile, M. Lovetere, J. Monti, A. Pascucci, Federico Sangati, Lucia Siciliani
The Evaluating Artificial Players for the Language Game “La Ghigliottina” (Ghigliottin-AI) task is one of the tasks organized in the context of the 2020 EVALITA edition, a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. Ghigliottin-AI participants are asked to build an artificial player able to solve “La Ghigliottina”, namely the final game of an Italian TV show called “L’Eredità”. The game involves a single player who is given a set of five words unrelated to each other, but related to a sixth word that represents the solution to the game. Fourteen teams registered for Ghigliottin-AI. Nevertheless, only two teams submitted runs. In order to evaluate the submitted systems, we rely on an API-based methodology, via a Remote Evaluation Server (RES). In this report we describe the Ghigliottin-AI task, the data and the evaluation, and we discuss the results.
Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1 Background and Motivation
Language games draw their challenge and excitement from the richness and ambiguity of natural language, and therefore have attracted the attention of researchers in the fields of Artificial Intelligence and Natural Language Processing. For instance, IBM Watson is a system which successfully challenged human champions of “Jeopardy!”, a game in which contestants are presented with clues in the form of answers, and must phrase their responses in the form of a question (Ferrucci et al., 2010; Molino et al., 2015). Another popular language game is solving crossword puzzles. The first experience reported in the literature is Proverb (Littman et al., 2002), which exploits large libraries of clues and solutions to past crossword puzzles. WebCrow is the first solver for Italian crosswords (Ernandes et al., 2008).
Following the first edition of the NLP4FUN task (Basile et al., 2018), proposed at EVALITA 2018, we propose a new edition of the task whose aim is to design a solver for “The Guillotine” (La Ghigliottina, in Italian) game. It is inspired by the final game of an Italian TV show called “L’Eredità”. The game, broadcast by Italian national TV, involves a single player, who is given a set of five words (the clues), each linked in some way to a specific word that represents the unique solution of the game. The clues are unrelated to each other, but each of them has a hidden association with the solution. Once the clues are given, the player has one minute to find the solution. For example, given the five clues pie, bad, Adam, core and eye, the solution is apple, because: apple-pie is a kind of pie; bad apple is a way of referring to a trouble maker; Adam’s apple is the prominent part of men’s throat; apple core is the centre of the apple; apple of someone’s eye is a way of referring to someone’s beloved person.
This report is organized as follows: in Section 2 we describe the Ghigliottin-AI task. Section 3 presents the dataset. The task evaluation is covered in Section 4. The results achieved by the participants are given in Section 5. Conclusions are drawn in Section 6.
Citations: 3
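The rules of “La Ghigliottina” summarized in the Ghigliottin-AI entry above (five clues, one hidden solution associated with each of them) suggest a very simple baseline: pick the candidate word linked to the most clues. The sketch below is a toy illustration only; the `ASSOCIATIONS` table is hand-made and hypothetical, and real systems would derive associations from corpora or lexical resources.

```python
def solve(clues, associations):
    """Return the candidate associated with the most clues.

    `associations` maps each candidate word to the set of words it is linked
    to. Ties are broken alphabetically so the result is deterministic.
    """
    scores = {
        candidate: sum(1 for clue in clues if clue in linked)
        for candidate, linked in associations.items()
    }
    return min(scores, key=lambda candidate: (-scores[candidate], candidate))


# Hypothetical association table built around the example in the text.
ASSOCIATIONS = {
    "apple": {"pie", "bad", "Adam", "core", "eye"},
    "cream": {"pie", "eye"},
    "boy": {"bad"},
}

answer = solve(["pie", "bad", "Adam", "core", "eye"], ASSOCIATIONS)
```

Counting matches treats every association as equally strong; weighting each link, for example by co-occurrence strength, would be a natural refinement.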
AcCompl-it @ EVALITA2020: Overview of the Acceptability & Complexity Evaluation Task for Italian
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7725
D. Brunato, C. Chesi, F. Dell’Orletta, S. Montemagni, Giulia Venturi, Roberto Zamparelli
The Acceptability and Complexity evaluation task for Italian (AcCompl-it) was aimed at developing and evaluating methods to classify Italian sentences according to Acceptability and Complexity. It consists of two independent tasks asking participants to predict either the acceptability or the complexity rate (or both) of a given set of sentences previously scored by native speakers on a 1-to-7 points Likert scale. In this paper, we introduce the datasets distributed to the participants, we describe the different approaches of the participating systems and provide a first analysis of the obtained results.
Citations: 8
HaSpeeDe 2 @ EVALITA2020: Overview of the EVALITA 2020 Hate Speech Detection Task
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6897
M. Sanguinetti, G. Comandini, Elisa Di Nuovo, Simona Frenda, M. Stranisci, C. Bosco, Tommaso Caselli, V. Patti, Irene Russo
The Hate Speech Detection (HaSpeeDe 2) task is the second edition of a shared task on the detection of hateful content in Italian Twitter messages. HaSpeeDe 2 is composed of a Main task (hate speech detection) and two Pilot tasks (stereotype and nominal utterance detection). Systems were challenged along two dimensions: (i) time, with test data coming from a different time period than the training data, and (ii) domain, with test data coming from the news domain (i.e., news headlines). Overall, 14 teams participated in the Main task; the best systems achieved a macro F1-score of 0.8088 and 0.7744 on the in-domain and out-of-domain test sets, respectively. Six teams submitted their results for Pilot task 1 (stereotype detection); the best systems achieved a macro F1-score of 0.7719 and 0.7203 on the in-domain and out-of-domain test sets. We did not receive any submission for Pilot task 2.
Citations: 54
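Several entries in this listing, including the HaSpeeDe 2 overview above, report macro-averaged F1. As a reminder of what that number means, here is a minimal sketch that computes it from gold and predicted labels; the label lists are invented, and the convention of scoring a class with no true or predicted instances as 0 is an assumption.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented labels: 1 = hateful, 0 = not hateful.
gold_labels = [1, 1, 0, 0]
predicted_labels = [1, 0, 0, 0]
score = macro_f1(gold_labels, predicted_labels)
```

Because every class contributes equally to the average, macro F1 penalizes systems that do well only on the majority class.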