CHANGE-IT @ EVALITA 2020: Change Headlines, Adapt News, GEnerate (short paper)
Lorenzo De Mattei, Michele Cafagna, F. Dell’Orletta, M. Nissim, Albert Gatt
Published in: EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
DOI: 10.4000/BOOKS.AACCADEMIA.7250 (https://doi.org/10.4000/BOOKS.AACCADEMIA.7250)
Abstract
We propose a generation task for Italian, more specifically a style transfer task for headlines of Italian newspapers. This is the first shared task on generation included in the EVALITA evaluation framework, and one of the reasons for proposing it is to stimulate more research on generation within the Italian community. With this aim in mind, we release to the participating teams not only training data but also a baseline sequence-to-sequence model that performs the task, to help everyone get started, including those not accustomed to Natural Language Generation (NLG) approaches. At the same time, we explore the complex issue of automatic evaluation of generated text, which is receiving particular attention in the NLG community.

Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Task and Motivation

We propose a generation task for Italian in the context of the EVALITA 2020 campaign (Basile et al., 2020). More specifically, we design a style transfer task for headlines of Italian newspapers. We believe it is the first time that a shared task on generation is offered in the context of EVALITA, and one of the reasons for proposing it is to stimulate more research on generation within the Italian community. With this goal in mind, we release to the potential participating teams not only training data but also a baseline sequence-to-sequence model that performs the task, to help everyone get started, including those not yet accustomed to generation models. This baseline model casts the style transfer problem as an extreme summarisation task, which illustrates how versatile the problem is in terms of possible approaches. At the same time, this task will help to further explore the complex issue of evaluating generated text, which is receiving particular attention in the international Natural Language Generation community (Gatt and Krahmer, 2018; van der Lee et al., 2019).
Task. The task is cast as a “headline translation” problem, defined as follows. Given a collection of headlines from two Italian newspapers at opposite ends of the political spectrum, call them G and R, change all G-headlines to headlines in style R, and all R-headlines to headlines in style G. In the context of this task we need to take care of two crucial aspects: data and evaluation. Details on data are provided in Section 2, and on evaluation in Section 3.
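
The bidirectional setup described in the task definition can be sketched as a small interface. This is only an illustrative sketch, not the organisers' baseline: the names `Headline`, `TransferModel`, `transfer_all` and `tagging_stub` are hypothetical, and the stub stands in for a real model (for example a sequence-to-sequence system) by simply tagging the text with the requested target style. Only the labels G and R come from the task description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout: the task description only fixes the labels G and R.
OPPOSITE: Dict[str, str] = {"G": "R", "R": "G"}


@dataclass(frozen=True)
class Headline:
    text: str
    style: str  # "G" or "R": the newspaper whose style the headline is in


# A transfer model maps (headline text, target style) to a rewritten headline.
TransferModel = Callable[[str, str], str]


def transfer_all(headlines: List[Headline], model: TransferModel) -> List[Headline]:
    """Rewrite every headline into the style of the other newspaper."""
    rewritten = []
    for headline in headlines:
        target = OPPOSITE[headline.style]
        rewritten.append(Headline(text=model(headline.text, target), style=target))
    return rewritten


def tagging_stub(text: str, target: str) -> str:
    """Stand-in for a real model: it only marks the requested target style."""
    return f"[{target}] {text}"
```

Any system that accepts a headline and a target style can be plugged in as `model`; the routing from G to R and from R to G stays the same regardless of the approach chosen.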