
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020: Latest Articles

CAPISCO @ CONcreTEXT 2020: (Un)supervised Systems to Contextualize Concreteness with Norming Data
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7475
Alessandro Bondielli, Gianluca E. Lebani, Lucia C. Passaro, Alessandro Lenci
This paper describes several approaches to the automatic rating of the concreteness of concepts in context, developed for the EVALITA 2020 “CONcreTEXT” task. Our systems focus on the interplay between words and their surrounding context by (i) exploiting annotated resources, (ii) using BERT masking to find potential substitutes of the target in specific contexts and measuring their average similarity with concrete and abstract centroids, and (iii) automatically generating labelled datasets to fine-tune transformer models for regression. All the approaches have been tested on both English and Italian data. The best system for each language ranked second in the task.
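Approach (ii) above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the substitutes proposed by BERT masking are already available as embedding vectors, and it assumes (our choice, not stated in the abstract) that the score is the mean difference between each substitute's cosine similarity to a concrete centroid and to an abstract centroid. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def concreteness_score(substitutes, concrete_seeds, abstract_seeds):
    """Mean of (similarity to concrete centroid - similarity to abstract centroid).

    Assumption: a positive value suggests a concrete reading of the target in
    this context, a negative value an abstract one.
    """
    concrete_c = centroid(concrete_seeds)
    abstract_c = centroid(abstract_seeds)
    diffs = [cosine(s, concrete_c) - cosine(s, abstract_c) for s in substitutes]
    return sum(diffs) / len(diffs)


# Toy 2-D embeddings (hypothetical): first axis "concrete", second "abstract".
concrete_seeds = [[1.0, 0.0], [0.9, 0.1]]
abstract_seeds = [[0.0, 1.0], [0.1, 0.9]]
substitutes = [[1.0, 0.1], [0.8, 0.2]]
print(concreteness_score(substitutes, concrete_seeds, abstract_seeds))
```

In a real system the vectors would come from a contextual or static embedding model, and the raw score would still need to be mapped onto the rating scale used by the task.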
Citations: 1
App2Check @ ATE_ABSITA 2020: Aspect Term Extraction and Aspect-based Sentiment Analysis (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6892
E. Rosa, A. Durante
In this paper we describe and present the results of the system we specifically developed and submitted for our participation in the ATE ABSITA 2020 evaluation campaign on the Aspect Term Extraction (ATE), Aspect-based Sentiment Analysis (ABSA), and Sentiment Analysis (SA) tasks. The official results show that App2Check ranks first in all three tasks, reaching an F1 score that is 0.14236 higher than the second-best system in the ATE task and 0.11943 higher in the ABSA task; it shows a Root-Mean-Square Error (RMSE) that is 0.13075 lower than the second classified in the SA
Citations: 2
CHILab @ HaSpeeDe 2: Enhancing Hate Speech Detection with Part-of-Speech Tagging (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7057
Giuseppe Gambino, R. Pirrone
The present paper describes two neural network systems for Hate Speech Detection that make use not only of the pre-processed text but also of its Part-of-Speech (PoS) tags. The first system uses a Transformer Encoder block, a relatively novel neural network architecture that arose as a substitute for recurrent neural networks. The second system uses a Depth-wise Separable Convolutional Neural Network, a type of CNN that has become known in the field of image processing thanks to its computational efficiency. These systems were used to participate in the HaSpeeDe 2 task of the EVALITA 2020 workshop under the team name CHILab, where our best system, the one that uses the Transformer, ranked first in two out of four tasks and third in the other two. The systems have also been tested on English, Spanish and German.
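The computational efficiency attributed to depth-wise separable convolutions can be made concrete by counting weights. The sketch below is a generic illustration for a 1-D convolution without bias terms, not a description of the CHILab architecture; the kernel size and channel counts are hypothetical.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 1-D convolution (bias omitted)."""
    return kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a depth-wise separable 1-D convolution (bias omitted).

    Depth-wise step: one kernel per input channel.
    Point-wise step: a 1x1 convolution that mixes the channels.
    """
    depthwise = kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Hypothetical layer: kernel of width 3, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable)  # 98304 33152
```

For this configuration the separable layer uses roughly a third of the weights of the standard one; the saving grows with the kernel size.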
Citations: 6
UniBO @ KIPoS: Fine-tuning the Italian "BERTology" for PoS-tagging Spoken Data (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7768
F. Tamburini
The use of contextualised word embeddings has allowed for a significant performance increase in almost all Natural Language Processing (NLP) applications. Recently, some new models developed specifically for Italian became available to scholars. This work applies simple fine-tuning methods to these models and evaluates whether that is sufficient to produce high-performance solutions for the EVALITA 2020 KIPOS PoS-tagging task (Bosco et al., 2020).
Citations: 2
SNK @ DANKMEMES: Leveraging Pretrained Embeddings for Multimodal Meme Detection (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7352
S. Fiorucci
In this paper, we describe and present the results of a meme detection system, specifically developed and submitted for our participation in the first subtask of DANKMEMES (EVALITA 2020). We built simple classifiers consisting of feed-forward neural networks. They leverage existing pretrained embeddings for both text and image representation. Our best system (SNK1) achieves good results in meme detection (F1 = 0.8473), ranking 2nd in the competition, 0.0028 behind the first-ranked system.
1 System description
1.1 General approach and tools
DANKMEMES (Miliani et al., 2020) is a task for meme recognition and hate speech/event identification in memes and is part of the EVALITA 2020 evaluation campaign (Basile et al., 2020). For our participation in the first subtask of DANKMEMES, we built simple classification models for meme detection. The main challenge is to effectively combine textual and image inputs. We tried to exploit the ability of pretrained embeddings to represent the information present in text and images at a limited computational cost. To quickly build various prototypes of neural networks, we used Uber's Ludwig framework (Molino et al., 2019), a toolbox built on top of TensorFlow that facilitates and speeds up the training and testing of various models. We trained our models using Google Colaboratory, a hosted Jupyter notebook service that provides free access to GPUs, with some resource and time limitations.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
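The idea of feeding pretrained text and image embeddings to a simple feed-forward classifier can be sketched as a forward pass in plain Python. This is an illustration only: it does not use Ludwig, the layer sizes and random weights are hypothetical, and concatenation is assumed as the way the two inputs are combined.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def meme_probability(text_emb, image_emb, hidden, output):
    """Concatenate both embeddings, apply one hidden layer, return P(meme)."""
    x = list(text_emb) + list(image_emb)
    h = relu(dense(x, *hidden))
    (z,) = dense(h, *output)
    return sigmoid(z)


rng = random.Random(0)
text_emb = [rng.random() for _ in range(4)]   # stand-in for a sentence embedding
image_emb = [rng.random() for _ in range(6)]  # stand-in for an image embedding
hidden = init_layer(10, 8, rng)
output = init_layer(8, 1, rng)
print(meme_probability(text_emb, image_emb, hidden, output))
```

Because the embeddings are precomputed, only the small classifier on top needs training, which is consistent with the limited computational cost mentioned above.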
Citations: 2
PoliTeam @ AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6807
Giuseppe Attanasio, Eliana Pastor
We present a multi-agent classification solution for identifying misogynous and aggressive content in Italian tweets. A first agent uses modern Sentence Embedding techniques to encode tweets and an SVM classifier to produce initial labels. A second agent, based on TF-IDF and Italian misogyny lexicons, is jointly adopted to improve the first agent on uncertain predictions. We evaluate our approach in the Automatic Misogyny Identification (AMI) Shared Task of the EVALITA 2020 campaign. Results show that TF-IDF and lexicons effectively improve the supervised agent trained on sentence embeddings.
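The two-agent arrangement can be sketched as a simple fallback rule. The abstract does not say how uncertainty is measured or how the second agent scores a tweet, so everything specific here is an assumption: uncertainty is taken to be a small absolute SVM decision margin, the second agent is reduced to a single lexicon-based score, and both thresholds are hypothetical.

```python
def combine_agents(svm_margin, lexicon_score,
                   margin_threshold=0.5, lexicon_threshold=0.0):
    """Return 1 (misogynous) or 0 (not misogynous).

    svm_margin: signed distance from the first agent's decision boundary.
    lexicon_score: score from the second agent (e.g. TF-IDF weighted lexicon hits).
    The second agent is consulted only when the first one is uncertain.
    """
    if abs(svm_margin) >= margin_threshold:
        # First agent is confident: keep its label.
        return 1 if svm_margin > 0 else 0
    # First agent is uncertain: defer to the second agent.
    return 1 if lexicon_score > lexicon_threshold else 0


print(combine_agents(1.2, -3.0))  # confident first agent decides: 1
print(combine_agents(0.1, 2.0))   # uncertain, second agent decides: 1
print(combine_agents(0.1, -1.0))  # uncertain, second agent decides: 0
```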
Citations: 8
DeepReading @ SardiStance 2020: Combining Textual, Social and Emotional Features
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7129
María S. Espinosa, Rodrigo Agerri, Álvaro Rodrigo, Roberto Centeno
In this paper we describe our participation in the SardiStance shared task held at EVALITA 2020. We developed a set of classifiers that combine text features, such as those produced by the best-performing systems based on large pre-trained language models, with user profile features, such as psychological traits and social media user interactions. The classification algorithms chosen for our models were various monolingual and multilingual Transformer models for text-only classification, and XGBoost for the non-textual features. The textual and contextual models were combined by a weighted voting ensemble learning system. Our approach obtained the best score for Task B, on Contextual Stance Detection.
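A weighted voting ensemble of the kind mentioned above can be sketched as follows. This is a generic soft-voting illustration, not the authors' system: the label names, the per-model probabilities and the weights are hypothetical.

```python
def weighted_vote(model_probabilities, weights):
    """Soft weighted voting.

    model_probabilities: one dict per model, mapping label -> probability.
    weights: one non-negative weight per model.
    Returns the label with the highest weighted probability sum.
    """
    totals = {}
    for probabilities, weight in zip(model_probabilities, weights):
        for label, p in probabilities.items():
            totals[label] = totals.get(label, 0.0) + weight * p
    return max(totals, key=totals.get)


# Hypothetical outputs of a text model and a model over non-textual features.
text_model = {"AGAINST": 0.5, "FAVOR": 0.3, "NONE": 0.2}
context_model = {"AGAINST": 0.2, "FAVOR": 0.7, "NONE": 0.1}

print(weighted_vote([text_model, context_model], [0.6, 0.4]))  # FAVOR
print(weighted_vote([text_model, context_model], [0.9, 0.1]))  # AGAINST
```

The two calls show how the weights decide the outcome when the models disagree; in practice such weights would be tuned on held-out data.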
Citations: 6
CONcreTEXT @ EVALITA2020: The Concreteness in Context Task
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7445
Lorenzo Gregori, Maria Montefinese, D. Radicioni, Andrea Amelio Ravelli, Rossella Varvara
The focus of the CONCRETEXT task is conceptual concreteness: systems were asked to compute a value expressing to what extent target concepts are concrete (i.e., more or less perceptually salient) within a given context of occurrence. To this end, we developed a new dataset annotated with concreteness ratings and used it as the gold standard in the evaluation of systems. Four teams participated in this first edition of the task, with a total of 15 runs submitted. Interestingly, these works extend the information on conceptual concreteness available in existing (non-contextual) norms derived from human judgments with new knowledge from recently developed neural architectures, in much the same multidisciplinary spirit in which the CONCRETEXT task was organized.
Citations: 10
UNITOR @ DANKMEME: Combining Convolutional Models and Transformer-based architectures for accurate MEME management
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7420
Claudia Breazzano, E. Rubino, D. Croce, R. Basili
This paper describes the UNITOR system that participated in the “multimoDal Artefacts recogNition Knowledge for MEMES” (DANKMEMES) task within the context of EVALITA 2020. UNITOR implements a neural model that combines a Deep Convolutional Neural Network, which encodes the visual information of input images, with a Transformer-based architecture, which encodes the meaning of the attached texts. UNITOR ranked first in all subtasks, confirming the robustness of the investigated neural architectures and suggesting the beneficial impact of the proposed combination strategy.
Citations: 1
Fontana-Unipi @ HaSpeeDe2: Ensemble of transformers for the Hate Speech task at Evalita (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6979
Michele Fontana, Giuseppe Attardi
We describe our approach and experiments to tackle Task A of the second edition of HaSpeeDe, within the Evalita 2020 evaluation campaign. The proposed model consists of an ensemble of classifiers built from three variants of a common neural architecture. Each classifier uses contextual representations from transformers trained on Italian texts and fine-tuned on the training set of the challenge. We tested the proposed model on the two official test sets: the in-domain test set, containing only tweets, and the out-of-domain one, which also includes news headlines. Our submissions ranked 4th on the tweets test set and 17th on the second test set.
Citations: 1