EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020: Latest Papers

No Place For Hate Speech @ AMI: Convolutional Neural Network and Word Embedding for the Identification of Misogyny in Italian (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.6834
Adriano dos S. R. da Silva, N. T. Roman
In this article, we describe two classification models (a Convolutional Neural Network and a Logistic Regression classifier), arranged according to three different strategies, submitted to subtask A of Automatic Misogyny Identification at EVALITA 2020. Results were very encouraging for detecting misogyny, although aggressiveness was detected less accurately. Our second strategy, which uses a Convolutional Neural Network to identify misogyny and logistic regression to identify aggressiveness, took sixth place in the competition.
Citations: 3
Venses @ AcCompl-It: Computing Complexity vs Acceptability with a Constituent Trigram Model and Semantics
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7735
R. Delmonte
In this paper we present work carried out for the AcCompl-It task. ItVENSES is a system for syntactic and semantic processing that relies on ItGetaruns, a parser for Italian, to analyse each sentence. In previous EVALITA tasks we only used semantics to produce the results. In this year's EVALITA, we used both a statistically based approach and the semantic one used previously. The statistical approach is characterized by the use of trigrams of constituents computed by the system and checked against a trigram model derived from the constituency version of VIT (Venice Italian Treebank). Results, measured in terms of correlation, are not particularly high: below 50% for the Acceptability task and slightly over 30% for the Complexity one.
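The idea of checking constituent trigrams against a treebank-derived model can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a trigram check is turned into a score, so the coverage ratio used here, the padding symbols, the function names, and the toy constituent labels are all hypothetical and are not taken from ItVENSES or VIT.

```python
from collections import Counter


def trigrams(labels):
    """Return the constituent-label trigrams of one sentence, with boundary padding."""
    padded = ["<s>", "<s>"] + list(labels) + ["</s>"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


def build_model(reference_sequences):
    """Count every trigram observed in the reference (treebank-like) sequences."""
    model = Counter()
    for sequence in reference_sequences:
        model.update(trigrams(sequence))
    return model


def coverage(labels, model):
    """Fraction of a sentence's trigrams that were seen in the reference model.

    Assumption: coverage stands in for whatever scoring the real system uses.
    """
    grams = trigrams(labels)
    seen = sum(1 for gram in grams if gram in model)
    return seen / len(grams)


# Toy reference data (hypothetical constituent sequences, not from VIT).
model = build_model([["NP", "VP", "PP"], ["NP", "VP", "NP"]])

fully_seen = coverage(["NP", "VP", "PP"], model)
half_seen = coverage(["NP", "VP", "AP"], model)
unseen = coverage(["VP", "NP", "PP"], model)
print(fully_seen, half_seen, unseen)
```

A sentence whose constituent order matches the reference data gets a high coverage value, while an unusual ordering gets a low one; a real system would more likely use smoothed probabilities than a raw seen/unseen ratio.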
Citations: 3
DANKMEMES @ EVALITA 2020: The Memeing of Life: Memes, Multimodality and Politics
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7330
Martina Miliani, Giulia Giorgi, Ilir Rama, G. Anselmi, Gianluca E. Lebani
DANKMEMES is a shared task proposed for the 2020 EVALITA campaign, focusing on the automatic classification of Internet memes. Providing a corpus of 2,361 memes on the 2019 Italian Government Crisis, DANKMEMES features three tasks: A) Meme Detection, B) Hate Speech Identification, and C) Event Clustering. Overall, 5 groups took part in the first task, 2 in the second and 1 in the third. The best system was proposed by the UniTor group and achieved an F1 score of 0.8501 for task A, 0.8235 for task B and 0.2657 for task C. In this report, we describe how the task was set up, report the system results and discuss them.
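For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it for a single positive class. It is an assumption that the binary form applies here: the abstract does not say how F1 is averaged for each task, and the labels in the example are invented, not DANKMEMES data.

```python
def f1_score(gold, predicted, positive=1):
    """Binary F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive and p == positive)
    false_pos = sum(1 for g, p in pairs if g != positive and p == positive)
    false_neg = sum(1 for g, p in pairs if g == positive and p != positive)
    if true_pos == 0:
        # No correct positive prediction: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Invented labels for six images (1 = meme, 0 = not a meme).
gold = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
score = f1_score(gold, predicted)
print(round(score, 4))
```

Here two memes are found, one is missed and one non-meme is wrongly flagged, so precision and recall are both 2/3 and F1 equals 2/3 as well.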
Citations: 20
EVALITA 2020: Overview of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6747
Valerio Basile, D. Croce, Maria Di Maro, Lucia C. Passaro
The Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA) is the biennial initiative aimed at promoting the development of language and speech technologies for the Italian language. EVALITA is promoted by the Italian Association of Computational Linguistics (AILC) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA) and the Italian Association for Speech Sciences (AISV). EVALITA provides a shared framework where different systems and approaches can be scientifically evaluated and compared with each other on a large variety of tasks, suggested and organized by the Italian research community. The proposed tasks represent scientific challenges where methods, resources, and systems can be tested against shared benchmarks representing open linguistic issues or real-world applications, possibly from a multilingual and/or multimodal perspective. The collected data sets give scientists significant opportunities to explore old and new problems concerning NLP in Italian, to develop solutions, and to discuss NLP-related issues within the community. Some tasks are traditionally present in the evaluation campaign, while others are completely new. This paper introduces the tasks proposed at EVALITA 2020 and provides an overview of the participants and systems whose descriptions and results are reported in these Proceedings. The EVALITA 2020 edition, held online on December 17th due to the COVID-19 pandemic, comprises 14 different tasks. In particular, the selected tasks are grouped into five research areas (tracks) according to their objectives and characteristics, namely (i) Affect, Hate, and Stance, (ii) Creativity and Style, (iii) New Challenges in Long-standing Tasks, (iv) Semantics and Multimodality, (v) Time and Diachrony. This edition attracted broad participation, with 51 groups whose participants are affiliated with institutions in 14 countries. Although EVALITA is generally promoted within and targeted at the Italian research community, this edition saw international participation, partly because several Italian researchers working in different countries contributed to the organization of the tasks or participated in them as authors. This overview is organized as follows: Section 2 briefly describes the tasks belonging to the various areas. Section 3 discusses participation in the workshop from several perspectives, from research area to author affiliation. Section 4 describes the criteria used to assign the award for the best system across tasks, decided by an ad hoc committee on the basis of suggestions from task organizers and reviewers. Finally, Section 5 comments on the results obtained and on the future of the workshop.
Citations: 91
UR NLP @ HaSpeeDe 2 at EVALITA 2020: Towards Robust Hate Speech Detection with Contextual Embeddings
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6967
J. Hoffmann, Udo Kruschwitz
We describe our approach to address Task A of the EVALITA 2020 Hate Speech Detection (HaSpeeDe2) challenge. We submitted two runs that are both based on contextual embeddings, which we had chosen due to their effectiveness in solving a wide range of NLP problems. For our baseline run we use stacked embeddings that serve as features in a linear SVM. Our second run is a simple ensemble approach of three SVMs with majority voting. Both approaches outperform the official baselines by a large margin, and the ensemble classifier in particular demonstrates robust performance on different types of test data, coming 6th (out of 27 runs) for news headlines and 10th (out of 27) for Twitter feeds.
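The majority-voting step of such an ensemble can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the three prediction lists stand in for the outputs of three already-trained SVMs, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(run_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label sequences by majority vote.

    run_predictions[k][i] is the label classifier k assigns to item i.
    With three binary classifiers a tie cannot occur.
    """
    if not run_predictions:
        raise ValueError("need at least one classifier")
    n_items = len(run_predictions[0])
    if any(len(preds) != n_items for preds in run_predictions):
        raise ValueError("all classifiers must label the same items")
    combined = []
    for labels in zip(*run_predictions):
        # most_common(1) yields [(label, count)] for the most frequent label.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on five texts (1 = hateful).
svm_a = [1, 0, 1, 0, 0]
svm_b = [1, 1, 0, 0, 0]
svm_c = [0, 1, 1, 0, 1]
ensemble = majority_vote([svm_a, svm_b, svm_c])
print(ensemble)
```

Each text receives the label that at least two of the three classifiers agree on, which tends to smooth out the individual errors of any single model.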
Citations: 5
ghostwriter19 @ SardiStance: Generating New Tweets to Classify SardiStance EVALITA 2020 Political Tweets (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7109
Mauro Bennici
Understanding events and the dominant opinion is of great help in conveying the desired message to a potential audience, whether for marketing or for political propaganda. Doing so while the event is still ongoing is vital for preparing alerts that require immediate action. A micro-message platform like Twitter is the ideal place to read a large amount of data linked to a theme and self-categorized by its users through hashtags and mentions. In this research, I will show how a simple translator can be used to bring to a common denominator the styles, vocabulary, grammar, and other characteristics that make each of us unique in the way we express ourselves. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 1
SardiStance @ EVALITA2020: Overview of the Task on Stance Detection in Italian Tweets
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7084
A. T. Cignarella, Mirko Lai, C. Bosco, V. Patti, Paolo Rosso
SardiStance is the first shared task for Italian on the automatic classification of stance in tweets. It is articulated in two different settings: A) Textual Stance Detection, exploiting only the information provided by the tweet, and B) Contextual Stance Detection, with the addition of information on the tweet itself, such as the number of retweets, the number of favours or the date of posting; contextual information about the author, such as follower count, location and user's biography; and additional knowledge extracted from the user's network of friends, followers, retweets, quotes and replies. The task was one of the most popular at EVALITA 2020 (Basile et al., 2020), with 12 participating teams from both academia and industry submitting a total of 22 runs for Task A and 13 for Task B.
Citations: 40
matteo-brv @ DaDoEval: An SVM-based Approach for Automatic Document Dating (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7593
M. Brivio
This paper describes our contribution to the EVALITA 2020 shared task DaDoEval (Dating Document Evaluation). The solution we present is based on a linear multi-class Support Vector Machine classifier trained on a combination of character and word n-grams, as well as the number of word tokens per document. Despite its simplicity, the system ranked first both in the coarse-grained classification task on same-genre data and in the one on cross-genre data, achieving macro-average F1 scores of 0.934 and 0.413, respectively. The system implementation is available at https://github.com/matteobrv/DaDoEval.
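The feature set described above (character n-grams, word n-grams and a token count) can be illustrated with a small extractor. This is a sketch under assumptions, not the published system: the abstract does not give the n-gram orders, the tokenization or any weighting, so the trigram and bigram settings, whitespace tokenization, raw counts and all names below are hypothetical. The resulting feature dictionary would then be vectorized and passed to a linear SVM.

```python
from collections import Counter


def char_ngrams(text, n):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens, n):
    """All overlapping word n-grams, joined with a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(document, char_n=3, word_n=2):
    """Count character and word n-grams and record the number of tokens.

    Assumptions: lowercasing, whitespace tokenization and raw counts.
    """
    text = document.lower()
    tokens = text.split()
    features = Counter()
    for gram in char_ngrams(text, char_n):
        features["char:" + gram] += 1
    for gram in word_ngrams(tokens, word_n):
        features["word:" + gram] += 1
    features["n_tokens"] = len(tokens)
    return features


feats = extract_features("La guerra è finita")
print(feats["n_tokens"], feats["word:la guerra"], feats["char:la "])
```

Prefixing the keys with "char:" and "word:" keeps the two n-gram families apart when they are merged into one feature space, so a word bigram can never collide with an identical character sequence.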
Citations: 1
ArchiMeDe @ DANKMEMES: A New Model Architecture for Meme Detection
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7405
Jinen Setpal, Gabriele Sarti
We introduce ArchiMeDe, a multimodal neural network-based architecture used to solve the DANKMEMES meme detection subtask at the 2020 EVALITA campaign. The system incorporates information from visual and textual sources through a multimodal neural ensemble to predict whether input images and their respective metadata are memes or not. Each pre-trained neural network in the ensemble is first fine-tuned individually on the training dataset to perform domain adaptation. Learned text and visual representations are then concatenated to obtain a single multimodal embedding
Citations: 1
ghostwriter19 @ ATE_ABSITA: Zero-Shot and ONNX to Speed up BERT on Sentiment Analysis Tasks at EVALITA 2020 (short paper)
Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6889
Mauro Bennici
English. With the arrival of BERT in 2018, NLP research took a significant step forward. However, the necessary computing power has grown accordingly. Various distillation and optimization systems have been adopted, but they offer a poor cost-benefit ratio. The most important improvements are obtained by creating increasingly complex models with more layers and parameters. In this research, we show how, by combining transfer learning, zero-shot learning, and ONNX Runtime, we can access the power of BERT right away, optimizing time and resources and achieving noticeable results from day one. Italiano. Con l'arrivo di BERT nel 2018, la ricerca nel campo dell'NLP ha fatto un notevole passo in avanti. La potenza di calcolo necessaria però è cresciuta di conseguenza. Diversi sistemi di distillazione e di ottimizzazione sono stati adottati ma risultano onerosi in termini di rapporto costo benefici. I vantaggi di maggior rilievo si ottengono creando modelli sempre più complessi con un maggior numero di layers e di parametri. In questa ricerca vedremo come mixando transfer learning, zero-shot learning e ONNX runtime si può accedere alla potenza di BERT da subito, ottimizzando tempo e risorse, raggiungendo risultati apprezzabili al day one.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 2