
Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021): Latest Publications

System Description for the CommonGen task with the POINTER model
Anna Shvets
In the current experiment, we tested the CommonGen dataset for the structure-to-text task from the GEM living benchmark with the constraint-based POINTER model. POINTER is a hybrid architecture that combines the insertion-based and transformer paradigms, predicting a token and its insertion position at the same time. The text is therefore generated gradually, in a parallel non-autoregressive manner, from a given set of keywords. The pretrained model was fine-tuned on the training split of the CommonGen dataset, and the generated outputs were compared against the validation and challenge splits. The resulting metric outputs, which measure lexical equivalence, semantic similarity, and diversity, are discussed in detail in the present system description.
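The insertion paradigm can be pictured as repeatedly widening the keyword sequence until the model declines to insert anything further. Below is a minimal, self-contained sketch of that decoding loop; the `predict_insertions` function is a hypothetical stand-in for the trained POINTER model, not the authors' code.

```python
from typing import Callable, List, Optional

# Toy sketch of POINTER-style insertion-based decoding (not the authors' code).
# `predict_insertions` is a hypothetical stand-in for the trained model: at each
# stage it predicts, for every gap between current tokens, either a new token or
# None, meaning "insert nothing here".

NONE_TOKEN = None  # stands for the model's [NONE] prediction


def insertion_decode(
    keywords: List[str],
    predict_insertions: Callable[[List[str]], List[Optional[str]]],
    max_stages: int = 10,
) -> List[str]:
    """Expand a keyword sequence by repeatedly inserting tokens in parallel.

    At every stage the model scores all gaps (before, between, and after the
    current tokens) at once, so generation is non-autoregressive within a stage.
    """
    tokens = list(keywords)
    for _ in range(max_stages):
        # One prediction per gap: len(tokens) + 1 gaps in total.
        insertions = predict_insertions(tokens)
        if all(tok is NONE_TOKEN for tok in insertions):
            break  # the model inserted nothing anywhere: generation is done
        new_tokens: List[str] = []
        for gap_tok, tok in zip(insertions, tokens + [""]):
            if gap_tok is not NONE_TOKEN:
                new_tokens.append(gap_tok)
            if tok:
                new_tokens.append(tok)
        tokens = new_tokens
    return tokens


# A fake model that inserts "a" before the first keyword once, then stops.
def fake_model(tokens: List[str]) -> List[Optional[str]]:
    if tokens[0] != "a":
        return ["a"] + [NONE_TOKEN] * len(tokens)
    return [NONE_TOKEN] * (len(tokens) + 1)


print(insertion_decode(["dog", "catches", "frisbee"], fake_model))
# ['a', 'dog', 'catches', 'frisbee']
```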
{"title":"System Description for the CommonGen task with the POINTER model","authors":"Anna Shvets","doi":"10.18653/v1/2021.gem-1.15","DOIUrl":"https://doi.org/10.18653/v1/2021.gem-1.15","url":null,"abstract":"In a current experiment we were testing CommonGen dataset for structure-to-text task from GEM living benchmark with the constraint based POINTER model. POINTER represents a hybrid architecture, combining insertion-based and transformer paradigms, predicting the token and the insertion position at the same time. The text is therefore generated gradually in a parallel non-autoregressive manner, given the set of keywords. The pretrained model was fine-tuned on a training split of the CommonGen dataset and the generation result was compared to the validation and challenge splits. The received metrics outputs, which measure lexical equivalence, semantic similarity and diversity, are discussed in details in a present system description.","PeriodicalId":431658,"journal":{"name":"Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128405737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
NUIG-DSI’s submission to The GEM Benchmark 2021
Nivranshu Pasricha, Mihael Arcan, P. Buitelaar
This paper describes the submission by NUIG-DSI to the GEM Benchmark 2021. We participate in the modeling shared task, where we submit outputs on four datasets for data-to-text generation: DART, WebNLG (en), E2E, and CommonGen. We follow an approach similar to the one described in the GEM benchmark paper, using the pre-trained T5-base model for our submission. We train this model on additional monolingual data, experimenting with masking strategies that specifically target entities, predicates, and concepts, as well as a random masking strategy, for pre-training. We find that random masking performs best in terms of automatic evaluation metrics, though the results are not statistically significantly different from those of the other masking strategies.
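To make the masking idea concrete, here is a toy sketch contrasting entity-focused and random masking in the T5 sentinel-token format; the `entity_spans` input (e.g. produced by an NER tagger) and the helper names are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import List, Tuple

# Toy sketch of two masking strategies for T5-style pre-training (not the
# authors' code). T5 replaces masked spans in the input with sentinel tokens
# <extra_id_0>, <extra_id_1>, ... and trains the decoder to emit each sentinel
# followed by the dropped-out content.


def mask_spans(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Replace each (start, end) span with a sentinel; return (input, target)."""
    source, target = [], []
    last = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        source.extend(tokens[last:start])
        source.append(sentinel)
        target.append(sentinel)
        target.extend(tokens[start:end])
        last = end
    source.extend(tokens[last:])
    return " ".join(source), " ".join(target)


def random_masking(tokens: List[str], ratio: float = 0.15) -> List[Tuple[int, int]]:
    """Pick single-token spans uniformly at random."""
    n = max(1, int(len(tokens) * ratio))
    starts = sorted(random.sample(range(len(tokens)), n))
    return [(s, s + 1) for s in starts]


tokens = "Galway is a city in the west of Ireland".split()
# Hypothetical entity spans, e.g. from an NER tagger: "Galway", "Ireland".
entity_spans = [(0, 1), (8, 9)]

print(mask_spans(tokens, entity_spans))            # entity-focused masking
print(mask_spans(tokens, random_masking(tokens)))  # random masking
```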
{"title":"NUIG-DSI’s submission to The GEM Benchmark 2021","authors":"Nivranshu Pasricha, Mihael Arcan, P. Buitelaar","doi":"10.18653/v1/2021.gem-1.13","DOIUrl":"https://doi.org/10.18653/v1/2021.gem-1.13","url":null,"abstract":"This paper describes the submission by NUIG-DSI to the GEM benchmark 2021. We participate in the modeling shared task where we submit outputs on four datasets for data-to-text generation, namely, DART, WebNLG (en), E2E and CommonGen. We follow an approach similar to the one described in the GEM benchmark paper where we use the pre-trained T5-base model for our submission. We train this model on additional monolingual data where we experiment with different masking strategies specifically focused on masking entities, predicates and concepts as well as a random masking strategy for pre-training. In our results we find that random masking performs the best in terms of automatic evaluation metrics, though the results are not statistically significantly different compared to other masking strategies.","PeriodicalId":431658,"journal":{"name":"Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117276439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SimpleNER Sentence Simplification System for GEM 2021
KV Aditya Srivatsa, Monil Gokani, Manish Shrivastava
This paper describes SimpleNER, a model developed for the sentence simplification task at GEM 2021. Our system is a monolingual Seq2Seq Transformer architecture that uses control tokens prepended to the data, allowing the model to shape the generated simplifications according to user-desired attributes. Additionally, we show that NER-tagging the training data before use helps stabilize the effect of the control tokens and significantly improves the overall performance of the system. We also employ pretrained embeddings to reduce data sparsity and allow the model to produce more generalizable outputs.
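As an illustration of the control-token mechanism, the sketch below prepends ACCESS-style tokens to a source sentence and NER-tags it. The specific token names (`<NbChars_...>`, `<LevSim_...>`), the similarity proxy, and the placeholder-tag scheme are assumptions for illustration, since the abstract does not spell out the exact setup.

```python
# Toy sketch of prepending control tokens to a simplification training pair
# (not the authors' exact scheme; token names and proxies are assumptions).

def build_input(source: str, target: str, ner_tags: dict) -> str:
    """Prefix the source with control tokens computed from the (source, target) pair."""
    # Compression ratio: how much shorter the simplification should be.
    nb_chars = round(len(target) / len(source), 1)
    # Crude similarity proxy; a real system might use character-level
    # Levenshtein similarity instead.
    overlap = len(set(source.split()) & set(target.split()))
    lev_sim = round(overlap / max(len(source.split()), 1), 1)
    # NER-tag the source so rare names do not fragment into subwords.
    for entity, tag in ner_tags.items():
        source = source.replace(entity, tag)
    return f"<NbChars_{nb_chars}> <LevSim_{lev_sim}> {source}"


src = "Niels Bohr formulated the principle of complementarity in 1928."
tgt = "Niels Bohr created an important idea in physics."
print(build_input(src, tgt, {"Niels Bohr": "[PER]"}))
# <NbChars_0.8> <LevSim_0.3> [PER] formulated the principle of complementarity in 1928.
```

At inference time, the user sets the control tokens directly (e.g. a lower `<NbChars_...>` value for a more aggressive compression), which is what lets the model shape its output to the desired attributes.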
{"title":"SimpleNER Sentence Simplification System for GEM 2021","authors":"KV Aditya Srivatsa, Monil Gokani, Manish Shrivastava","doi":"10.18653/v1/2021.gem-1.14","DOIUrl":"https://doi.org/10.18653/v1/2021.gem-1.14","url":null,"abstract":"This paper describes SimpleNER, a model developed for the sentence simplification task at GEM-2021. Our system is a monolingual Seq2Seq Transformer architecture that uses control tokens pre-pended to the data, allowing the model to shape the generated simplifications according to user desired attributes. Additionally, we show that NER-tagging the training data before use helps stabilize the effect of the control tokens and significantly improves the overall performance of the system. We also employ pretrained embeddings to reduce data sparsity and allow the model to produce more generalizable outputs.","PeriodicalId":431658,"journal":{"name":"Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127699895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Human Perception in Natural Language Generation
Lorenzo De Mattei, Huiyuan Lai, F. Dell’Orletta, M. Nissim
We ask subjects whether they perceive a set of texts as human-produced; some of the texts are actually human-written, while others are automatically generated. We use this data to fine-tune a GPT-2 model, pushing it to generate more human-like texts, and observe that the fine-tuned model produces texts that are indeed perceived as more human-like than those of the original model. In this context, we show that our automatic evaluation strategy correlates well with human judgements. We also run a linguistic analysis to unveil the characteristics of human- vs. machine-perceived language.
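A minimal sketch of the fine-tuning step follows, assuming a hypothetical list of perception-rated texts and the Hugging Face transformers GPT-2 classes; the data, threshold, and training loop are illustrative, not the authors' setup.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Sketch only: select texts that subjects perceived as human-written, then
# fine-tune GPT-2 on them with the standard language-modeling objective.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Hypothetical data: (text, fraction of subjects who judged it human-written).
rated_texts = [
    ("The storm broke just after midnight.", 0.9),
    ("Midnight storm breaking happened occurs.", 0.2),
]
# Keep only texts that most subjects perceived as human-produced.
train_texts = [text for text, score in rated_texts if score >= 0.5]

model.train()
for text in train_texts:
    batch = tokenizer(text, return_tensors="pt")
    # With labels == input_ids, the model returns the LM cross-entropy loss.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```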
{"title":"Human Perception in Natural Language Generation","authors":"Lorenzo De Mattei, Huiyuan Lai, F. Dell’Orletta, M. Nissim","doi":"10.18653/v1/2021.gem-1.2","DOIUrl":"https://doi.org/10.18653/v1/2021.gem-1.2","url":null,"abstract":"We ask subjects whether they perceive as human-produced a bunch of texts, some of which are actually human-written, while others are automatically generated. We use this data to fine-tune a GPT-2 model to push it to generate more human-like texts, and observe that this fine-tuned model produces texts that are indeed perceived more human-like than the original model. Contextually, we show that our automatic evaluation strategy well correlates with human judgements. We also run a linguistic analysis to unveil the characteristics of human- vs machine-perceived language.","PeriodicalId":431658,"journal":{"name":"Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124675960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Semantic Similarity Based Evaluation for Abstractive News Summarization
Figen Beken Fikri, Kemal Oflazer, B. Yanikoglu
ROUGE is a widely used evaluation metric in text summarization. However, it is not suitable for evaluating abstractive summarization systems, as it relies on lexical overlap between the gold standard and the generated summaries. This limitation becomes more apparent for agglutinative languages with very large vocabularies and high type/token ratios. In this paper, we present semantic similarity models for Turkish and apply them as evaluation metrics for an abstractive summarization task. To achieve this, we translated the English STSb dataset into Turkish, thereby also presenting the first semantic textual similarity dataset for Turkish. We show that our best similarity models align better with average human judgments than ROUGE does, in terms of both Pearson and Spearman correlations.
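The evaluation protocol (scoring summaries by embedding cosine similarity against references, then correlating those scores with human judgments) can be sketched as follows; the multilingual model name and the toy Turkish sentence pairs are placeholders, not the models trained in the paper.

```python
# Sketch of correlating a semantic-similarity metric with human judgments
# (illustrative only; model name, sentences, and scores are placeholders).
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

references = ["Kedi halının üzerinde uyuyor.",
              "Bugün hava çok güzel.",
              "Takım finalde kaybetti."]
summaries = ["Kedi halıda uyumakta.",
             "Yarın yağmur bekleniyor.",
             "Takım finali kazanamadı."]
human_scores = [4.8, 1.2, 4.5]  # hypothetical 0-5 adequacy judgments

emb_ref = model.encode(references, convert_to_tensor=True)
emb_sum = model.encode(summaries, convert_to_tensor=True)
# Cosine similarity between each reference and its paired summary.
metric_scores = util.cos_sim(emb_ref, emb_sum).diagonal().tolist()

print("Pearson: ", pearsonr(metric_scores, human_scores)[0])
print("Spearman:", spearmanr(metric_scores, human_scores)[0])
```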
{"title":"Semantic Similarity Based Evaluation for Abstractive News Summarization","authors":"Figen Beken Fikri, Kemal Oflazer, B. Yanikoglu","doi":"10.18653/v1/2021.gem-1.3","DOIUrl":"https://doi.org/10.18653/v1/2021.gem-1.3","url":null,"abstract":"ROUGE is a widely used evaluation metric in text summarization. However, it is not suitable for the evaluation of abstractive summarization systems as it relies on lexical overlap between the gold standard and the generated summaries. This limitation becomes more apparent for agglutinative languages with very large vocabularies and high type/token ratios. In this paper, we present semantic similarity models for Turkish and apply them as evaluation metrics for an abstractive summarization task. To achieve this, we translated the English STSb dataset into Turkish and presented the first semantic textual similarity dataset for Turkish as well. We showed that our best similarity models have better alignment with average human judgments compared to ROUGE in both Pearson and Spearman correlations.","PeriodicalId":431658,"journal":{"name":"Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128843073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10