
Latest Publications: Proceedings of the Conference on Empirical Methods in Natural Language Processing

Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval
Houxing Ren, Linjun Shou, Ning Wu, Ming Gong, Daxin Jiang
In monolingual dense retrieval, many works focus on how to distill knowledge from a cross-encoder re-ranker to a dual-encoder retriever, and these methods achieve better performance due to the effectiveness of the cross-encoder re-ranker. However, we find that the performance of the cross-encoder re-ranker is heavily influenced by the number of training samples and the quality of negative samples, both of which are hard to obtain in the cross-lingual setting. In this paper, we propose to use a query generator as the teacher in the cross-lingual setting, which is less dependent on abundant training samples and high-quality negative samples. In addition to traditional knowledge distillation, we further propose a novel enhancement method that uses the query generator to help the dual-encoder align queries from different languages without requiring any additional parallel sentences. The experimental results show that our method outperforms state-of-the-art methods on two benchmark datasets.
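As a rough illustration of the distillation setup described above, here is a minimal PyTorch sketch (function and variable names are illustrative assumptions, not the authors' code): the query-generator teacher scores each candidate passage by the log-likelihood of generating the query, and the dual-encoder student is trained to match that distribution.

```python
# Hypothetical sketch: distilling a query-generator teacher into a
# dual-encoder retriever student. Names are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def distill_step(student_q_emb, student_p_embs, teacher_loglikes, temperature=1.0):
    """One KD step over one query and its candidate passages.

    student_q_emb:    (d,)   query embedding from the dual-encoder
    student_p_embs:   (n, d) candidate passage embeddings
    teacher_loglikes: (n,)   log P(query | passage) from the query generator
    """
    # Student relevance scores: dot-product similarities.
    student_logits = student_p_embs @ student_q_emb           # (n,)
    # Teacher distribution over candidates from generator log-likelihoods.
    teacher_probs = F.softmax(teacher_loglikes / temperature, dim=0)
    # KL divergence pushes the student ranking toward the teacher's.
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=0),
        teacher_probs,
        reduction="sum",
    )

# Toy usage with random tensors standing in for real encoder outputs.
loss = distill_step(torch.randn(128), torch.randn(8, 128), torch.randn(8))
print(loss)
```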
{"title":"Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval","authors":"Houxing Ren, Linjun Shou, Ning Wu, Ming Gong, Daxin Jiang","doi":"10.48550/arXiv.2303.14991","DOIUrl":"https://doi.org/10.48550/arXiv.2303.14991","url":null,"abstract":"In monolingual dense retrieval, lots of works focus on how to distill knowledge from cross-encoder re-ranker to dual-encoder retriever and these methods achieve better performance due to the effectiveness of cross-encoder re-ranker. However, we find that the performance of the cross-encoder re-ranker is heavily influenced by the number of training samples and the quality of negative samples, which is hard to obtain in the cross-lingual setting. In this paper, we propose to use a query generator as the teacher in the cross-lingual setting, which is less dependent on enough training samples and high-quality negative samples. In addition to traditional knowledge distillation, we further propose a novel enhancement method, which uses the query generator to help the dual-encoder align queries from different languages, but does not need any additional parallel sentences. The experimental results show that our method outperforms the state-of-the-art methods on two benchmark datasets.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"19 1","pages":"3107-3121"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73931418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification
Chunpu Xu, Jing Li
Social media creates massive multimedia content with paired images and text every day, presenting a pressing need to automate vision and language understanding for various multimodal classification tasks. Compared to the commonly researched visual-lingual data, social media posts tend to exhibit more implicit image-text relations. To better glue the cross-modal semantics therein, we capture hinting features from user comments, which are retrieved by jointly leveraging visual and lingual similarity. Afterwards, the classification tasks are explored via self-training in a teacher-student framework, motivated by the usually limited labeled data scales in existing benchmarks. Substantial experiments are conducted on four multimodal social media benchmarks for image-text relation classification, sarcasm detection, sentiment classification, and hate speech detection. The results show that our method further advances the performance of previous state-of-the-art models, which do not employ comment modeling or self-training.
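The teacher-student self-training loop can be sketched generically as follows; the `train` and `predict` callables and the confidence threshold are hypothetical stand-ins rather than the paper's exact procedure.

```python
# Generic self-training sketch in the teacher-student spirit of the abstract.
def self_train(labeled, unlabeled, train, predict, rounds=3, threshold=0.9):
    """Iteratively pseudo-label unlabeled posts and retrain on the union.

    train:   callable mapping a list of (x, y) pairs to a fitted model
    predict: callable mapping (model, x) to a (label, confidence) pair
    """
    data = list(labeled)
    model = train(data)                       # initial teacher
    for _ in range(rounds):
        pseudo = []
        for x in unlabeled:
            label, confidence = predict(model, x)
            if confidence >= threshold:       # keep only confident pseudo-labels
                pseudo.append((x, label))
        model = train(data + pseudo)          # student becomes the next teacher
    return model
```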
{"title":"Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification","authors":"Chunpu Xu, Jing Li","doi":"10.48550/arXiv.2303.15016","DOIUrl":"https://doi.org/10.48550/arXiv.2303.15016","url":null,"abstract":"Social media is daily creating massive multimedia content with paired image and text, presenting the pressing need to automate the vision and language understanding for various multimodal classification tasks. Compared to the commonly researched visual-lingual data, social media posts tend to exhibit more implicit image-text relations. To better glue the cross-modal semantics therein, we capture hinting features from user comments, which are retrieved via jointly leveraging visual and lingual similarity. Afterwards, the classification tasks are explored via self-training in a teacher-student framework, motivated by the usually limited labeled data scales in existing benchmarks. Substantial experiments are conducted on four multimodal social media benchmarks for image-text relation classification, sarcasm detection, sentiment classification, and hate speech detection. The results show that our method further advances the performance of previous state-of-the-art models, which do not employ comment modeling or self-training.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"1 1","pages":"5644-5656"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88744527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Chunpu Xu, Hanzhuo Tan, Jing Li, Piji Li
Multimedia communication with text and images is popular on social media. However, limited studies have examined how images are structured with text to form coherent meanings in human cognition. To fill this gap, we present a novel concept of cross-modality discourse, reflecting how human readers couple image and text understanding. Text descriptions are first derived from images (referred to as subtitles) in the multimedia contexts. Five labels -- entity-level insertion, projection, and concretization, and scene-level restatement and extension -- are further employed to shape the structure of subtitles and texts and present their joint meanings. As a pilot study, we also build the very first dataset containing 16K multimedia tweets with manually annotated discourse labels. The experimental results show that a multimedia encoder based on multi-head attention with captions is able to obtain state-of-the-art results.
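A toy version of a caption-aware encoder built on multi-head attention might look like the sketch below; the hidden size, pooling, and the five-way output head are assumptions for illustration, not the paper's architecture.

```python
# Toy cross-attention encoder over post tokens and subtitle/caption tokens.
import torch
import torch.nn as nn

class CaptionAwareClassifier(nn.Module):
    """Post text attends over image-derived caption tokens, then classifies."""

    def __init__(self, dim=256, heads=8, num_labels=5):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, num_labels)   # five discourse labels

    def forward(self, text_tokens, caption_tokens):
        # Query = post text; key/value = caption tokens derived from the image.
        fused, _ = self.attn(text_tokens, caption_tokens, caption_tokens)
        return self.head(fused.mean(dim=1))      # mean-pool, then classify

model = CaptionAwareClassifier()
logits = model(torch.randn(2, 12, 256), torch.randn(2, 9, 256))
print(logits.shape)  # torch.Size([2, 5])
```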
{"title":"Understanding Social Media Cross-Modality Discourse in Linguistic Space","authors":"Chunpu Xu, Hanzhuo Tan, Jing Li, Piji Li","doi":"10.48550/arXiv.2302.13311","DOIUrl":"https://doi.org/10.48550/arXiv.2302.13311","url":null,"abstract":"The multimedia communications with texts and images are popular on social media. However, limited studies concern how images are structured with texts to form coherent meanings in human cognition. To fill in the gap, we present a novel concept of cross-modality discourse, reflecting how human readers couple image and text understandings. Text descriptions are first derived from images (named as subtitles) in the multimedia contexts. Five labels -- entity-level insertion, projection and concretization and scene-level restatement and extension -- are further employed to shape the structure of subtitles and texts and present their joint meanings. As a pilot study, we also build the very first dataset containing 16K multimedia tweets with manually annotated discourse labels. The experimental results show that the multimedia encoder based on multi-head attention with captions is able to obtain the-state-of-the-art results.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"14 1","pages":"2459-2471"},"PeriodicalIF":0.0,"publicationDate":"2023-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87646296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Prompting for Multimodal Hateful Meme Classification
Rui Cao, R. Lee, Wen-Haw Chong, Jing Jiang
Hateful meme classification is a challenging multimodal task that requires complex reasoning and contextual background knowledge. Ideally, we could leverage an explicit external knowledge base to supplement contextual and cultural information in hateful memes. However, there is no known explicit external knowledge base that could provide such hate speech contextual information. To address this gap, we propose PromptHate, a simple yet effective prompt-based model that prompts pre-trained language models (PLMs) for hateful meme classification. Specifically, we construct simple prompts and provide a few in-context examples to exploit the implicit knowledge in the pre-trained RoBERTa language model for hateful meme classification. We conduct extensive experiments on two publicly available hateful and offensive meme datasets. Our experimental results show that PromptHate achieves a high AUC of 90.96, outperforming state-of-the-art baselines on the hateful meme classification task. We also perform fine-grained analyses and case studies on various prompt settings and demonstrate the effectiveness of the prompts for hateful meme classification.
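A minimal sketch of the prompting recipe outlined above, using a masked RoBERTa LM with one in-context demonstration; the prompt wording, label words, and the example captions are illustrative guesses, not PromptHate's actual templates.

```python
# Hedged sketch: score a meme (via its caption text) by comparing masked-LM
# probabilities of two label words after an in-context demonstration.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

demo = "a photo of people sharing a meal with friendly text. it was good. "
query = "an image of a frog with a hostile text overlay. it was <mask>."
inputs = tokenizer(demo + query, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position, then compare the two label-word logits.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
good_id = tokenizer.convert_tokens_to_ids("\u0120good")  # " good"
bad_id = tokenizer.convert_tokens_to_ids("\u0120bad")    # " bad"
probs = logits[0, mask_pos, [good_id, bad_id]].softmax(dim=-1)
print({"benign": probs[0].item(), "hateful": probs[1].item()})
```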
{"title":"Prompting for Multimodal Hateful Meme Classification","authors":"Rui Cao, R. Lee, Wen-Haw Chong, Jing Jiang","doi":"10.48550/arXiv.2302.04156","DOIUrl":"https://doi.org/10.48550/arXiv.2302.04156","url":null,"abstract":"Hateful meme classification is a challenging multimodal task that requires complex reasoning and contextual background knowledge. Ideally, we could leverage an explicit external knowledge base to supplement contextual and cultural information in hateful memes. However, there is no known explicit external knowledge base that could provide such hate speech contextual information. To address this gap, we propose PromptHate, a simple yet effective prompt-based model that prompts pre-trained language models (PLMs) for hateful meme classification. Specifically, we construct simple prompts and provide a few in-context examples to exploit the implicit knowledge in the pre-trained RoBERTa language model for hateful meme classification. We conduct extensive experiments on two publicly available hateful and offensive meme datasets. Our experiment results show that PromptHate is able to achieve a high AUC of 90.96, outperforming state-of-the-art baselines on the hateful meme classification task. We also perform fine-grain analyses and case studies on various prompt settings and demonstrate the effectiveness of the prompts on hateful meme classification.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"83 1","pages":"321-332"},"PeriodicalIF":0.0,"publicationDate":"2023-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80098531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Capturing Topic Framing via Masked Language Modeling
Xiaobo Guo, Weicheng Ma, Soroush Vosoughi
Differential framing of issues can lead to divergent world views on important topics. This is especially true in domains where the information presented can reach a large audience, such as traditional and social media. Scalable and reliable measurement of such differential framing is an important first step in addressing it. In this work, based on the intuition that framing affects the tone and word choices in written language, we propose a framework for modeling the differential framing of issues through masked token prediction via large-scale fine-tuned language models (LMs). Specifically, we explore three key factors for our framework: 1) prompt generation methods for the masked token prediction; 2) methods for normalizing the output of fine-tuned LMs; 3) robustness to the choice of pre-trained LMs used for fine-tuning. Through experiments on a dataset of articles from traditional media outlets covering five diverse and politically polarized topics, we show that our framework can capture the differential framing of these topics with high reliability.
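For intuition, probing word choice via masked token prediction could look like this minimal sketch using an off-the-shelf masked LM; the prompt and candidate framing words are invented, and the paper's fine-tuning and normalization steps are only approximated by renormalizing over the candidate set.

```python
# Illustrative probe: compare probabilities of candidate framing words at a
# masked position, then normalize over the candidate set only.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")
prompt = "The new policy is <mask>."
candidates = [" necessary", " harmful", " unfair", " overdue"]

results = fill(prompt, targets=candidates)
scores = {r["token_str"].strip(): r["score"] for r in results}
total = sum(scores.values())                 # renormalize over candidates
print({word: round(score / total, 3) for word, score in scores.items()})
```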
{"title":"Capturing Topic Framing via Masked Language Modeling","authors":"Xiaobo Guo, Weicheng Ma, Soroush Vosoughi","doi":"10.48550/arXiv.2302.03183","DOIUrl":"https://doi.org/10.48550/arXiv.2302.03183","url":null,"abstract":"Differential framing of issues can lead to divergent world views on important issues. This is especially true in domains where the information presented can reach a large audience, such as traditional and social media. Scalable and reliable measurement of such differential framing is an important first step in addressing them. In this work, based on the intuition that framing affects the tone and word choices in written language, we propose a framework for modeling the differential framing of issues through masked token prediction via large-scale fine-tuned language models (LMs). Specifically, we explore three key factors for our framework: 1) prompt generation methods for the masked token prediction; 2) methods for normalizing the output of fine-tuned LMs; 3) robustness to the choice of pre-trained LMs used for fine-tuning. Through experiments on a dataset of articles from traditional media outlets covering five diverse and politically polarized topics, we show that our framework can capture differential framing of these topics with high reliability.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"56 1","pages":"6811-6825"},"PeriodicalIF":0.0,"publicationDate":"2023-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88012469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection
Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu
Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model. In this process, we typically have multiple types of knowledge extracted from the teacher model, and the problem is how to make full use of them to train the student model. Our preliminary study shows that: (1) not all of the knowledge is necessary for learning a good student model, and (2) knowledge distillation can benefit from certain knowledge at different training steps. In response, we propose an actor-critic approach to selecting appropriate knowledge to transfer during the process of knowledge distillation. In addition, we offer a refinement of the training algorithm to ease the computational burden. Experimental results on the GLUE datasets show that our method significantly outperforms several strong knowledge distillation baselines.
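A toy rendering of the selection idea, where a small learned policy weights several candidate distillation losses at each step; the module names, state features, and fixed loss values are illustrative, and the critic and reward machinery are omitted.

```python
# Sketch: an "actor" that weights candidate knowledge sources per step.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeSelector(nn.Module):
    """Policy network mapping training-state features to selection weights."""

    def __init__(self, num_sources=3, state_dim=4):
        super().__init__()
        self.policy = nn.Linear(state_dim, num_sources)

    def forward(self, state):
        return F.softmax(self.policy(state), dim=-1)  # weights sum to 1

selector = KnowledgeSelector()
# Stand-in per-step KD losses for three knowledge types
# (e.g., output logits, hidden states, attention maps).
kd_losses = torch.tensor([0.8, 1.2, 0.5])
state = torch.randn(4)                # e.g., features of training progress
weights = selector(state)
total_kd_loss = (weights * kd_losses).sum()
print(total_kd_loss)
```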
{"title":"Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection","authors":"Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu","doi":"10.48550/arXiv.2302.00444","DOIUrl":"https://doi.org/10.48550/arXiv.2302.00444","url":null,"abstract":"Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model. In this process, we typically have multiple types of knowledge extracted from the teacher model. The problem is to make full use of them to train the student model. Our preliminary study shows that: (1) not all of the knowledge is necessary for learning a good student model, and (2) knowledge distillation can benefit from certain knowledge at different training steps. In response to these, we propose an actor-critic approach to selecting appropriate knowledge to transfer during the process of knowledge distillation. In addition, we offer a refinement of the training algorithm to ease the computational burden. Experimental results on the GLUE datasets show that our method outperforms several strong knowledge distillation baselines significantly.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"47 1","pages":"6232-6244"},"PeriodicalIF":0.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85203036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Machine Translation Impact in E-commerce Multilingual Search
Bryan Zhang, Amita Misra
Previous work suggests that the performance of cross-lingual information retrieval correlates highly with the quality of machine translation. However, there may be a threshold beyond which improving query translation quality yields little or no benefit for further improving retrieval performance. This threshold may depend upon multiple factors, including the source and target languages, the existing MT system quality, and the search pipeline. In order to identify the benefit of improving an MT system for a given search pipeline, we investigate the sensitivity of retrieval quality to different levels of MT quality using experimental datasets collected from actual traffic. We systematically improve the quality of our MT systems on language pairs, as measured by MT evaluation metrics including BLEU and ChrF, to determine their impact on search precision metrics and extract signals that help guide improvement strategies. Using this information, we develop techniques to compare query translations for multiple language pairs and identify the most promising language pairs to invest in and improve.
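Scoring translations with the metrics named above is straightforward with the sacrebleu library; the sample query translation below is invented for illustration.

```python
# Compute corpus-level BLEU and ChrF for MT output against references.
import sacrebleu

hypotheses = ["wireless bluetooth headphones with noise cancelling"]
references = [["wireless bluetooth noise-cancelling headphones"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  ChrF: {chrf.score:.2f}")
```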
{"title":"Machine Translation Impact in E-commerce Multilingual Search","authors":"Bryan Zhang, Amita Misra","doi":"10.48550/arXiv.2302.00119","DOIUrl":"https://doi.org/10.48550/arXiv.2302.00119","url":null,"abstract":"Previous work suggests that performance of cross-lingual information retrieval correlates highly with the quality of Machine Translation. However, there may be a threshold beyond which improving query translation quality yields little or no benefit to further improve the retrieval performance. This threshold may depend upon multiple factors including the source and target languages, the existing MT system quality and the search pipeline. In order to identify the benefit of improving an MT system for a given search pipeline, we investigate the sensitivity of retrieval quality to the presence of different levels of MT quality using experimental datasets collected from actual traffic. We systematically improve the performance of our MT systems quality on language pairs as measured by MT evaluation metrics including Bleu and Chrf to determine their impact on search precision metrics and extract signals that help to guide the improvement strategies. Using this information we develop techniques to compare query translations for multiple language pairs and identify the most promising language pairs to invest and improve.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"7 1","pages":"99-109"},"PeriodicalIF":0.0,"publicationDate":"2023-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74304509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality
Verna Dankers, Ivan Titov
A recent line of work in NLP focuses on the (dis)ability of models to generalise compositionally for artificial languages. However, when considering natural language tasks, the data involved is not strictly, or locally, compositional. Quantifying the compositionality of data is a challenging task, which has been investigated primarily for short utterances. We use recursive neural models (Tree-LSTMs) with bottlenecks that limit the transfer of information between nodes. We illustrate that comparing data's representations in models with and without the bottleneck can be used to produce a compositionality metric. The procedure is applied to the evaluation of arithmetic expressions using synthetic data, and sentiment classification using natural language data. We demonstrate that compression through a bottleneck impacts non-compositional examples disproportionately and then use the bottleneck compositionality metric (BCM) to distinguish compositional from non-compositional samples, yielding a compositionality ranking over a dataset.
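The comparison underlying the metric can be sketched as follows: train one model with and one without a narrow bottleneck, then rank samples by how much the bottleneck hurts them. The module shape and the stand-in loss values below are assumptions; the actual models in the paper are Tree-LSTMs.

```python
# Sketch of the bottleneck-comparison idea behind the compositionality metric.
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Squeeze node representations through a narrow layer (d -> k -> d)."""

    def __init__(self, dim=128, k=8):
        super().__init__()
        self.down = nn.Linear(dim, k)
        self.up = nn.Linear(k, dim)

    def forward(self, h):
        return self.up(torch.tanh(self.down(h)))

def bcm_scores(loss_plain, loss_bottleneck):
    # A larger gap suggests the sample relies on non-compositional
    # information that cannot pass through the bottleneck.
    return loss_bottleneck - loss_plain

# Stand-in per-sample losses from the two trained models.
gap = bcm_scores(torch.tensor([0.20, 0.30]), torch.tensor([0.25, 0.90]))
print(gap.argsort(descending=True))  # most non-compositional samples first
```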
{"title":"Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality","authors":"Verna Dankers, Ivan Titov","doi":"10.48550/arXiv.2301.13714","DOIUrl":"https://doi.org/10.48550/arXiv.2301.13714","url":null,"abstract":"A recent line of work in NLP focuses on the (dis)ability of models to generalise compositionally for artificial languages. However, when considering natural language tasks, the data involved is not strictly, or locally, compositional. Quantifying the compositionality of data is a challenging task, which has been investigated primarily for short utterances. We use recursive neural models (Tree-LSTMs) with bottlenecks that limit the transfer of information between nodes. We illustrate that comparing data's representations in models with and without the bottleneck can be used to produce a compositionality metric. The procedure is applied to the evaluation of arithmetic expressions using synthetic data, and sentiment classification using natural language data. We demonstrate that compression through a bottleneck impacts non-compositional examples disproportionately and then use the bottleneck compositionality metric (BCM) to distinguish compositional from non-compositional samples, yielding a compositionality ranking over a dataset.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"14 1","pages":"4361-4378"},"PeriodicalIF":0.0,"publicationDate":"2023-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89799812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Audience-Centric Natural Language Generation via Style Infusion
Samraj Moorjani, A. Krishnan, Hari Sundaram, E. Maslowska, Aravind Sankar
Adopting contextually appropriate, audience-tailored linguistic styles is critical to the success of user-centric language generation systems (e.g., chatbots, computer-aided writing, dialog systems). While existing approaches demonstrate textual style transfer with large volumes of parallel or non-parallel data, we argue that grounding style on audience-independent external factors is innately limiting for two reasons. First, it is difficult to collect large volumes of audience-specific stylistic data. Second, some stylistic objectives (e.g., persuasiveness, memorability, empathy) are hard to define without audience feedback. In this paper, we propose the novel task of style infusion - infusing the stylistic preferences of audiences in pretrained language generation models. Since humans are better at pairwise comparisons than direct scoring - i.e., is Sample-A more persuasive/polite/empathic than Sample-B - we leverage limited pairwise human judgments to bootstrap a style analysis model and augment our seed set of judgments. We then infuse the learned textual style in a GPT-2 based text generator while balancing fluency and style adoption. With quantitative and qualitative assessments, we show that our infusion approach can generate compelling stylized examples with generic text prompts. The code and data are accessible at https://github.com/CrowdDynamicsLab/StyleInfusion.
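The pairwise human judgments suggest a Bradley-Terry-style training objective for the bootstrapped style scorer; a minimal sketch follows, with toy scores standing in for a real style-analysis model.

```python
# Pairwise preference loss: the audience-preferred sample should score higher.
import torch
import torch.nn.functional as F

def pairwise_style_loss(score_preferred, score_other):
    """Bradley-Terry style objective over pairwise judgments."""
    return -F.logsigmoid(score_preferred - score_other).mean()

# Toy scores a style scorer might assign to the two texts in each judgment.
s_a = torch.tensor([1.4, 0.2, 0.9])    # preferred (e.g., more persuasive)
s_b = torch.tensor([0.3, 0.5, -0.1])   # less-preferred pair members
print(pairwise_style_loss(s_a, s_b))
```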
{"title":"Audience-Centric Natural Language Generation via Style Infusion","authors":"Samraj Moorjani, A. Krishnan, Hari Sundaram, E. Maslowska, Aravind Sankar","doi":"10.48550/arXiv.2301.10283","DOIUrl":"https://doi.org/10.48550/arXiv.2301.10283","url":null,"abstract":"Adopting contextually appropriate, audience-tailored linguistic styles is critical to the success of user-centric language generation systems (e.g., chatbots, computer-aided writing, dialog systems). While existing approaches demonstrate textual style transfer with large volumes of parallel or non-parallel data, we argue that grounding style on audience-independent external factors is innately limiting for two reasons. First, it is difficult to collect large volumes of audience-specific stylistic data. Second, some stylistic objectives (e.g., persuasiveness, memorability, empathy) are hard to define without audience feedback. In this paper, we propose the novel task of style infusion - infusing the stylistic preferences of audiences in pretrained language generation models. Since humans are better at pairwise comparisons than direct scoring - i.e., is Sample-A more persuasive/polite/empathic than Sample-B - we leverage limited pairwise human judgments to bootstrap a style analysis model and augment our seed set of judgments. We then infuse the learned textual style in a GPT-2 based text generator while balancing fluency and style adoption. With quantitative and qualitative assessments, we show that our infusion approach can generate compelling stylized examples with generic text prompts. The code and data are accessible at https://github.com/CrowdDynamicsLab/StyleInfusion.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2 1","pages":"1919-1932"},"PeriodicalIF":0.0,"publicationDate":"2023-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75486985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Multitask Instruction-based Prompting for Fallacy Recognition
Tariq Alhindi, Tuhin Chakrabarty, Elena Musi, S. Muresan
Fallacies are used as seemingly valid arguments to support a position and persuade the audience of its validity. Recognizing fallacies is an intrinsically difficult task both for humans and machines. Moreover, a big challenge for computational models lies in the fact that fallacies are formulated differently across datasets, with differences in input format (e.g., question-answer pair, sentence with a fallacy fragment), genre (e.g., social media, dialogue, news), and the types and number of fallacies (from 5 to 18 types per dataset). To move towards solving the fallacy recognition task, we treat these differences across datasets as multiple tasks and show how instruction-based prompting in a multitask setup based on the T5 model improves results over approaches built for a specific dataset, such as T5, BERT, or GPT-3. We show the ability of this multitask prompting approach to recognize 28 unique fallacies across domains and genres, and we study the effect of model size and prompt choice by analyzing the per-class (i.e., fallacy type) results. Finally, we analyze the effect of annotation quality on model performance and the feasibility of complementing this approach with external knowledge.
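Instruction-based prompting with T5 can be illustrated as below; the instruction wording, option list, and example argument are assumptions rather than the paper's actual templates.

```python
# Sketch: prompt a T5 model with an instruction plus label options and
# decode the predicted fallacy label as text.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

prompt = (
    "Identify the fallacy in the following argument. "
    "Options: ad hominem, slippery slope, false dilemma, appeal to emotion. "
    "Argument: If we allow this policy, society will inevitably collapse."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```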
{"title":"Multitask Instruction-based Prompting for Fallacy Recognition","authors":"Tariq Alhindi, Tuhin Chakrabarty, Elena Musi, S. Muresan","doi":"10.48550/arXiv.2301.09992","DOIUrl":"https://doi.org/10.48550/arXiv.2301.09992","url":null,"abstract":"Fallacies are used as seemingly valid arguments to support a position and persuade the audience about its validity. Recognizing fallacies is an intrinsically difficult task both for humans and machines. Moreover, a big challenge for computational models lies in the fact that fallacies are formulated differently across the datasets with differences in the input format (e.g., question-answer pair, sentence with fallacy fragment), genre (e.g., social media, dialogue, news), as well as types and number of fallacies (from 5 to 18 types per dataset). To move towards solving the fallacy recognition task, we approach these differences across datasets as multiple tasks and show how instruction-based prompting in a multitask setup based on the T5 model improves the results against approaches built for a specific dataset such as T5, BERT or GPT-3. We show the ability of this multitask prompting approach to recognize 28 unique fallacies across domains and genres and study the effect of model size and prompt choice by analyzing the per-class (i.e., fallacy type) results. Finally, we analyze the effect of annotation quality on model performance, and the feasibility of complementing this approach with external knowledge.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"39 1","pages":"8172-8187"},"PeriodicalIF":0.0,"publicationDate":"2023-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84945825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9