Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7475
Alessandro Bondielli, Gianluca E. Lebani, Lucia C. Passaro, Alessandro Lenci
English. This paper describes several approaches to automatically rating the concreteness of concepts in context, developed for the EVALITA 2020 “CONcreTEXT” task. Our systems focus on the interplay between words and their surrounding context by (i) exploiting annotated resources, (ii) using BERT masking to find potential substitutes for the target in specific contexts and measuring their average similarity to concrete and abstract centroids, and (iii) automatically generating labelled datasets to fine-tune transformer models for regression. All approaches were tested on both English and Italian data. For each language, our best system ranked second in the task.
Title: CAPISCO @ CONcreTEXT 2020: (Un)supervised Systems to Contextualize Concreteness with Norming Data
Published in: EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
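The centroid comparison mentioned in point (ii) of the CAPISCO abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors are toy values, the substitute vectors are assumed to come from some masked language model step that is not shown, and scoring by the difference of the two average similarities is an assumption made here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def concreteness_score(substitute_vectors, concrete_seeds, abstract_seeds):
    """Average similarity to the concrete centroid minus average
    similarity to the abstract centroid (hypothetical scoring rule)."""
    concrete_centroid = centroid(concrete_seeds)
    abstract_centroid = centroid(abstract_seeds)
    n = len(substitute_vectors)
    avg_concrete = sum(cosine(v, concrete_centroid) for v in substitute_vectors) / n
    avg_abstract = sum(cosine(v, abstract_centroid) for v in substitute_vectors) / n
    return avg_concrete - avg_abstract


# Toy data (assumed): seed vectors for concrete and abstract words,
# and vectors of the substitutes proposed for one masked target.
concrete_seeds = [[1.0, 0.1], [0.9, 0.2]]
abstract_seeds = [[0.1, 1.0], [0.2, 0.9]]
substitutes = [[0.8, 0.3], [0.7, 0.2]]

score = concreteness_score(substitutes, concrete_seeds, abstract_seeds)
print(score)  # positive when the substitutes lie closer to the concrete centroid
```

A positive score indicates that the substitutes lean towards the concrete centroid, a negative one towards the abstract centroid; mapping this value onto the task's rating scale would need a further step that the abstract does not describe.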
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6892
E. Rosa, A. Durante
In this paper we describe the system we developed and submitted to the ATE_ABSITA 2020 evaluation campaign, covering the Aspect Term Extraction (ATE), Aspect-based Sentiment Analysis (ABSA), and Sentiment Analysis (SA) tasks, and we present its results. The official results show that App2Check ranks first in all three tasks, reaching an F1 score that is 0.14236 higher than the second-best system in the ATE task and 0.11943 higher in the ABSA task; it shows a Root-Mean-Square Error (RMSE) that is 0.13075 lower than the second-ranked system in the SA
Title: App2Check @ ATE_ABSITA 2020: Aspect Term Extraction and Aspect-based Sentiment Analysis (short paper)
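For readers unfamiliar with the RMSE figure quoted in the App2Check abstract, the following is a small sketch of how root-mean-square error is computed over gold and predicted scores. The function name and the example values are illustrative assumptions and are not taken from the task data.

```python
import math


def rmse(gold, predicted):
    """Root-mean-square error between two equal-length score lists."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and of equal length")
    squared_errors = [(g - p) ** 2 for g, p in zip(gold, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sentiment scores for three reviews.
gold_scores = [4.0, 2.0, 5.0]
system_scores = [3.5, 2.5, 4.0]
print(rmse(gold_scores, system_scores))
```

Lower values are better, which is why the abstract reports the system's RMSE as lower than that of the runner-up.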
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7014
Mariano Jason Rodriguez Cisnero, Reynier Ortega Bueno
English. This document describes our participation in the Hate Speech Detection task at EVALITA 2020. Our system is based on deep learning techniques, specifically RNNs and attention mechanisms, combined with transformer representations and linguistic features. Multi-task learning was used during training to increase the system's effectiveness. The results show that some of the selected features did not combine well within the model. Nevertheless, the level of generalization achieved yields encouraging results.
Title: UO @ HaSpeeDe2: Ensemble Model for Italian Hate Speech Detection (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7207
S. Kayalvizhi, D. Thenmozhi, Aravindan Chandrabose
Stance detection refers to detecting a person’s opinion about a target from their statements. The aim of the SardiStance task is to classify Italian tweets into the classes favor, against, or no feeling towards the target. The task has two sub-tasks: in Task A, classification must rely only on the textual meaning, whereas in Task B the tweets must be classified using contextual information along with the textual meaning. We present our solution for detecting stance using only the textual meaning (Task A), based on an encoder-decoder model and on transformers. Of these two approaches, simple transformers performed better than the encoder-decoder model, with an average F1-score of 0.4707.
Title: SSN NLP @ SardiStance : Stance Detection from Italian Tweets using RNN and Transformers (short paper)
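The "average F1-score" reported in the SSN NLP abstract can be made concrete with a short sketch. The abstract does not say which classes are averaged; the code below assumes the mean of the per-class F1 over the favor and against classes, and the label strings and example predictions are hypothetical.

```python
def f1_for_class(gold, predicted, label):
    """F1 of a single class, computed from true/false positives and false negatives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def average_f1(gold, predicted, labels=("favor", "against")):
    """Mean of the per-class F1 over the given labels (assumed metric definition)."""
    return sum(f1_for_class(gold, predicted, label) for label in labels) / len(labels)


# Hypothetical gold and predicted stances for four tweets.
gold = ["favor", "against", "none", "against"]
predicted = ["favor", "none", "none", "against"]
print(average_f1(gold, predicted))
```

Under this definition, the third class contributes only indirectly, through the errors it causes on the two averaged classes.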
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7092
Simone Giorgioni, Marcello Politi, Samir Salman, R. Basili, D. Croce
English. This paper describes the UNITOR system that participated in the Stance Detection in Italian tweets (SardiStance) task within the context of EVALITA 2020. UNITOR implements a transformer-based architecture whose accuracy is improved by adopting a Transfer Learning technique. In particular, this work investigates the possible contribution of three auxiliary tasks related to Stance Detection, i.e., Sentiment Detection, Hate Speech Detection and Irony Detection. Moreover, UNITOR relies on an additional dataset automatically downloaded and labeled through distant supervision. The UNITOR system ranked first in Task A within the competition. This confirms the effectiveness of Transformer-based architectures and the beneficial impact of the adopted strategies. Italian. This work describes UNITOR, one of the systems participating in the Stance Detection in Italian tweets (SardiStance) task. UNITOR implements a Transformer-based neural architecture whose accuracy is improved by applying a Transfer Learning method that exploits information from three auxiliary tasks, namely Sentiment Detection, Hate Speech Detection and Irony Detection. In addition, the training of UNITOR can rely on a set of data automatically downloaded and labeled through a simple Distant Supervision method. The system ranked first in the competition, confirming the effectiveness of Transformer-based architectures and the contribution of the strategies
Title: UNITOR @ Sardistance2020: Combining Transformer-based Architectures and Transfer Learning for Robust Stance Detection
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7129
María S. Espinosa, Rodrigo Agerri, Álvaro Rodrigo, Roberto Centeno
In this paper we describe our participation in the SardiStance shared task held at EVALITA 2020. We developed a set of classifiers that combine text features, modelled with the best-performing systems based on large pre-trained language models, with user profile features, such as psychological traits and social media user interactions. The classification algorithms chosen for our models were various monolingual and multilingual Transformer models for text-only classification, and XGBoost for the non-textual features. The textual and contextual models were combined by a weighted voting ensemble learning system. Our approach obtained the best score for Task B, on Contextual Stance Detection.
Title: DeepReading @ SardiStance 2020: Combining Textual, Social and Emotional Features
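The weighted voting ensemble mentioned in the DeepReading abstract can be sketched as follows. The abstract does not state how the weights were chosen or whether votes were cast over hard labels or probabilities, so this example assumes hard-label voting with hand-picked weights; the model roles and weight values are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per model.
    weights: one non-negative weight per model, in the same order.
    Ties are broken alphabetically so the result is deterministic.
    """
    if len(predictions) != len(weights) or not predictions:
        raise ValueError("need one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(sorted(totals), key=totals.get)


# Hypothetical setup: one textual model and two models built on
# user-profile features, each voting on the stance of a single tweet.
model_predictions = ["against", "favor", "favor"]
model_weights = [0.5, 0.3, 0.3]
print(weighted_vote(model_predictions, model_weights))
```

In this toy case the two lower-weighted models outvote the single higher-weighted one, which is the behaviour that distinguishes weighted voting from simply trusting the strongest model.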
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7768
F. Tamburini
English. The use of contextualised word embeddings has enabled considerable performance gains in almost all Natural Language Processing (NLP) applications. Recently, some new models developed specifically for Italian have become available to scholars. This work aims at applying simple fine-tuning methods to produce high-performance solutions for the EVALITA KIPOS PoS-tagging task (Bosco et al., 2020). Italian. The use of contextual word embeddings has enabled considerable performance gains in the automatic systems developed for various tasks in natural language processing. Recently, some new models developed specifically for the Italian language have been introduced. The aim of this work is to assess whether a simple fine-tuning of these models is sufficient to obtain high-level performance in the KIPOS task of EVALITA 2020.
Title: UniBO @ KIPoS: Fine-tuning the Italian "BERTology" for PoS-tagging Spoken Data (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7445
Lorenzo Gregori, Maria Montefinese, D. Radicioni, Andrea Amelio Ravelli, Rossella Varvara
The focus of the CONcreTEXT task is conceptual concreteness: systems were asked to compute a value expressing the extent to which target concepts are concrete (i.e., more or less perceptually salient) within a given context of occurrence. To this end, we developed a new dataset annotated with concreteness ratings and used it as the gold standard in evaluating systems. Four teams participated in this first edition of the task, submitting a total of 15 runs. Interestingly, these works extend the information on conceptual concreteness available in existing (non-contextual) norms derived from human judgments with new knowledge from recently developed neural architectures, in much the same multidisciplinary spirit in which the CONcreTEXT task was organized.
Title: CONcreTEXT @ EVALITA2020: The Concreteness in Context Task