
Annual Meeting of the Association for Computational Linguistics: Latest Publications

Enhancing Document-level Event Argument Extraction with Contextual Clues and Role Relevance
Pub Date : 2023-10-08 DOI: 10.18653/v1/2023.findings-acl.817
Wanlong Liu, Shaohuan Cheng, Di Zeng, Hong Qu
Document-level event argument extraction poses new challenges of long input and cross-sentence inference compared to its sentence-level counterpart. However, most prior works focus on capturing the relations between candidate arguments and the event trigger in each event, ignoring two crucial points: a) non-argument contextual clue information; b) the relevance among argument roles. In this paper, we propose SCPRG (Span-trigger-based Contextual Pooling and latent Role Guidance), a model containing two novel and effective modules that address these problems. The Span-Trigger-based Contextual Pooling (STCP) module adaptively selects and aggregates the information of non-argument clue words based on the context attention weights of specific argument-trigger pairs from the pre-trained model. The Role-based Latent Information Guidance (RLIG) module constructs latent role representations, lets them interact through role-interactive encoding to capture semantic relevance, and merges them into candidate arguments. Both STCP and RLIG introduce no more than 1% new parameters compared with the base model, are compact and portable, and can be easily applied to other event extraction models. Experiments on two public datasets show that SCPRG outperforms previous state-of-the-art methods, with improvements of 1.13 F1 on RAMS and 2.64 F1 on WikiEvents. Further analyses illustrate the interpretability of our model.
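To make the pooling idea concrete, here is a minimal sketch of span-trigger-based contextual pooling in PyTorch: non-argument context tokens are weighted by the attention a specific argument-trigger pair assigns to them. The shapes, the head-averaged attention matrix, and all names are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the STCP idea: pool non-argument context tokens, weighted by the
# attention that a specific (argument span, trigger) pair already assigns to
# them in a pre-trained encoder. Shapes and names are illustrative only.
import torch

def span_trigger_contextual_pool(hidden, attn, arg_span, trig_span):
    """hidden: (seq_len, dim) encoder states; attn: (seq_len, seq_len)
    attention weights averaged over heads/layers; spans are (start, end)."""
    seq_len, dim = hidden.shape
    # Attention paid to every token by the argument and trigger positions.
    arg_attn = attn[arg_span[0]:arg_span[1]].mean(dim=0)    # (seq_len,)
    trig_attn = attn[trig_span[0]:trig_span[1]].mean(dim=0)
    pair_attn = arg_attn * trig_attn                        # joint relevance
    # Mask out the argument and trigger themselves: we want non-argument clues.
    mask = torch.ones(seq_len)
    mask[arg_span[0]:arg_span[1]] = 0.0
    mask[trig_span[0]:trig_span[1]] = 0.0
    weights = pair_attn * mask
    weights = weights / weights.sum().clamp(min=1e-9)
    return weights @ hidden                                 # (dim,) clue vector

# Usage with random stand-in tensors:
h = torch.randn(128, 768)
a = torch.softmax(torch.randn(128, 128), dim=-1)
clue = span_trigger_contextual_pool(h, a, arg_span=(10, 12), trig_span=(40, 41))
print(clue.shape)  # torch.Size([768])
```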
Citations: 1
How-to Guides for Specific Audiences: A Corpus and Initial Findings
Pub Date : 2023-09-21 DOI: 10.18653/v1/2023.acl-srw.46
Nicola Fanton, Agnieszka Falenska, Michael Roth
We collect how-to guides for different target audiences and investigate qualitative and quantitative differences.
Citations: 0
Substitution-based Semantic Change Detection using Contextual Embeddings
Pub Date : 2023-09-05 DOI: 10.18653/v1/2023.acl-short.52
Dallas Card
Measuring semantic change has thus far remained a task where methods using contextual embeddings have struggled to improve upon simpler techniques relying only on static word vectors. Moreover, many of the previously proposed approaches suffer from downsides related to scalability and ease of interpretation. We present a simplified approach to measuring semantic change using contextual embeddings, relying only on the most probable substitutes for masked terms. Not only is this approach directly interpretable, it is also far more efficient in terms of storage, achieves superior average performance across the most frequently cited datasets for this task, and allows for more nuanced investigation of change than is possible with static word vectors.
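The substitute-based representation is easy to prototype. The sketch below uses a standard Hugging Face fill-mask pipeline to collect top-k substitutes for a target word in two corpora and compares the resulting distributions; the Jensen-Shannon comparison is our assumption for the scoring step, not necessarily the paper's exact measure.

```python
# Sketch: represent a target word in each period by the distribution of its
# most probable masked-LM substitutes, then compare the two distributions.
from collections import Counter
from transformers import pipeline
from scipy.spatial.distance import jensenshannon

fill = pipeline("fill-mask", model="bert-base-uncased")

def substitute_counts(sentences, target, top_k=10):
    counts = Counter()
    for s in sentences:
        masked = s.replace(target, fill.tokenizer.mask_token, 1)
        for pred in fill(masked, top_k=top_k):
            counts[pred["token_str"].strip()] += 1
    return counts

def change_score(old_sents, new_sents, target):
    c1 = substitute_counts(old_sents, target)
    c2 = substitute_counts(new_sents, target)
    vocab = sorted(set(c1) | set(c2))
    p = [c1[w] for w in vocab]
    q = [c2[w] for w in vocab]
    return jensenshannon(p, q)  # higher = more semantic change

print(change_score(["The plane flew over the cell tower."],
                   ["He was locked in a prison cell overnight."], "cell"))
```

Because only the top substitutes per occurrence are stored, the representation stays compact, which is the storage advantage the abstract alludes to.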
Citations: 1
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Pub Date : 2023-08-25 DOI: 10.18653/v1/2023.acl-long.664
Bang Yang, Fenglin Liu, X. Wu, Yaowei Wang, Xu Sun, Yuexian Zou
Supervised visual captioning models typically require large numbers of images or videos paired with descriptions in a specific language (i.e., vision-caption pairs) for training. However, collecting and labeling large-scale datasets is time-consuming and expensive for many scenarios and languages, so sufficient labeled pairs are usually not available. To deal with this label shortage, we present MultiCapCLIP, a simple yet effective zero-shot approach that can generate visual captions for different scenarios and languages without any labeled vision-caption pairs from downstream datasets. In the training stage, MultiCapCLIP requires only text data as input. It then conducts two main steps: 1) retrieving concept prompts that preserve the domain knowledge of new scenarios; 2) auto-encoding the prompts to learn writing styles for outputting captions in the desired language. In the testing stage, MultiCapCLIP instead takes visual data as input directly to retrieve the concept prompts and generate the final visual descriptions. Extensive experiments on image and video captioning across four benchmarks and four languages (English, Chinese, German, and French) confirm the effectiveness of our approach. Compared with state-of-the-art zero-shot and weakly-supervised methods, our method achieves absolute improvements of 4.8% and 21.5% in terms of the BLEU@4 and CIDEr metrics. Our code is available at https://github.com/yangbang18/MultiCapCLIP.
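As a rough illustration of the concept-retrieval step, the sketch below scores a small concept bank against an image in CLIP's shared embedding space and keeps the top-k concepts as prompt material. The checkpoint name and toy concept bank are assumptions; the paper's prompt templates and auto-encoding stage are not reproduced here.

```python
# Sketch of concept retrieval: rank a bank of concept words against an input
# image in CLIP's joint space and keep the top-k as prompt material.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
concepts = ["dog", "beach", "sunset", "bicycle", "crowd"]  # toy concept bank

def retrieve_concepts(image: Image.Image, top_k: int = 3):
    inputs = processor(text=concepts, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    sims = out.logits_per_image.squeeze(0)   # (num_concepts,) similarity scores
    best = sims.topk(top_k).indices.tolist()
    return [concepts[i] for i in best]

# img = Image.open("example.jpg")            # any local image
# print(retrieve_concepts(img))
# The retrieved concepts would then be filled into a prompt template such as
# "A picture of {c1}, {c2} and {c3}" before decoding a caption.
```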
Citations: 2
DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation
Pub Date : 2023-08-05 DOI: 10.18653/v1/2023.acl-long.92
Menglong Lu, Zhen Huang, Yunxiang Zhao, Zhiliang Tian, Yang Liu, Dongsheng Li
Self-training has emerged as an important line of research on domain adaptation. By taking the model's predictions as pseudo labels for the unlabeled data, self-training bootstraps the model with pseudo instances in the target domain. However, the prediction errors of pseudo labels (label noise) challenge the performance of self-training. To address this problem, previous approaches retrain the model only on reliable pseudo instances, i.e., those with high prediction confidence. Although these strategies effectively reduce label noise, they are prone to miss hard examples. In this paper, we propose a new self-training framework for domain adaptation, namely the Domain adversarial learning enhanced Meta Self-Training Framework (DaMSTF). Firstly, DaMSTF uses meta-learning to estimate the importance of each pseudo instance, so as to simultaneously reduce label noise and preserve hard examples. Secondly, we design a meta constructor for building the meta-validation set, which guarantees the effectiveness of the meta-learning module by improving the quality of the meta-validation set. Thirdly, we find that the meta-learning module suffers from vanishing training guidance and tends to converge to an inferior optimum. To this end, we employ domain adversarial learning as a heuristic neural-network initialization method, which helps the meta-learning module converge to a better optimum. Theoretically and experimentally, we demonstrate the effectiveness of the proposed DaMSTF. On the cross-domain sentiment classification task, DaMSTF improves the performance of BERT by an average of nearly 4%.
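The meta-learning step can be sketched with the generic meta-reweighting recipe: per-example weights on pseudo-labeled data are chosen so that a virtual one-step update improves a small clean meta-validation set. Everything below (function names, the sigmoid weighting, the single inner step, a buffer-free model) is a simplified assumption; the paper's meta constructor and adversarial initialization are omitted.

```python
# Sketch of meta-weighted self-training in the style of meta-reweighting:
# weights on pseudo-labeled examples are set by how much a virtual update
# with them helps a small clean meta-validation set.
import torch
import torch.nn.functional as F

def meta_reweight_step(model, opt, pseudo_x, pseudo_y, meta_x, meta_y, lr=1e-3):
    # 1) Weighted pseudo-label loss with learnable per-example weights.
    w = torch.zeros(len(pseudo_y), requires_grad=True)
    losses = F.cross_entropy(model(pseudo_x), pseudo_y, reduction="none")
    loss = (torch.sigmoid(w) * losses).mean()
    # 2) Virtual one-step update of the model parameters (keeps the graph).
    grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
    updated = {name: p - lr * g
               for (name, p), g in zip(model.named_parameters(), grads)}
    # 3) Meta-loss on clean validation data decides the weights.
    meta_logits = torch.func.functional_call(model, updated, (meta_x,))
    meta_loss = F.cross_entropy(meta_logits, meta_y)
    w_grad, = torch.autograd.grad(meta_loss, w)
    weights = torch.sigmoid(-w_grad).detach()  # downweight noisy pseudo labels
    # 4) Real update with the learned weights.
    opt.zero_grad()
    weighted = weights * F.cross_entropy(model(pseudo_x), pseudo_y,
                                         reduction="none")
    weighted.mean().backward()
    opt.step()
    return weights
```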
Citations: 1
Reasoning in Large Language Models Through Symbolic Math Word Problems
Pub Date : 2023-08-03 DOI: 10.18653/v1/2023.findings-acl.364
Vedant Gaur, Nikunj Saunshi
Large language models (LLMs) have revolutionized NLP by solving downstream tasks with little to no labeled data. Despite their versatile abilities, the larger question of their ability to reason remains ill-understood. This paper addresses reasoning in math word problems (MWPs) by studying symbolic versions of the numeric problems, since a symbolic expression is a "concise explanation" of the numeric answer. We create and use a symbolic version of the SVAMP dataset and find that GPT-3's davinci-002 model also has good zero-shot accuracy on symbolic MWPs. To evaluate the faithfulness of the model's reasoning, we go beyond accuracy and additionally evaluate the alignment between the final answer and the outputted reasoning, which correspond to the numeric and symbolic answers, respectively, for MWPs. We explore a self-prompting approach to encourage the symbolic reasoning to align with the numeric answer, thus equipping the LLM with the ability to provide concise and verifiable reasoning and making it more interpretable. Surprisingly, self-prompting also improves the symbolic accuracy to be higher than both the numeric and symbolic accuracies alone, thus providing an ensembling effect. The SVAMP_Sym dataset will be released for future research on symbolic math problems.
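The numeric-symbolic alignment check lends itself to a short sketch: substitute the problem's concrete numbers into the model's symbolic answer and test whether the numeric answer is reproduced. The function and variable names are illustrative, assuming SymPy for the algebra.

```python
# Sketch of the alignment check: a symbolic answer is "faithful" if plugging
# the problem's concrete numbers into it reproduces the numeric answer.
import sympy as sp

def aligned(symbolic_answer: str, bindings: dict, numeric_answer: float) -> bool:
    expr = sp.sympify(symbolic_answer)
    value = expr.subs({sp.Symbol(k): v for k, v in bindings.items()})
    return sp.simplify(value - numeric_answer) == 0

# "John has x apples and buys y more" with x=5, y=3 -> numeric answer 8.
print(aligned("x + y", {"x": 5, "y": 3}, 8))   # True
print(aligned("x * y", {"x": 5, "y": 3}, 8))   # False: unfaithful reasoning
```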
Citations: 4
TREA: Tree-Structure Reasoning Schema for Conversational Recommendation
Pub Date : 2023-07-20 DOI: 10.48550/arXiv.2307.10543
Wendi Li, Wei Wei, Xiaoye Qu, Xian-ling Mao, Ye Yuan, Wenfeng Xie, Dangyang Chen
Conversational recommender systems (CRS) aim to trace the dynamic interests of users through dialogue in a timely manner and generate relevant responses for item recommendations. Recently, various external knowledge bases (especially knowledge graphs) have been incorporated into CRS to enhance the understanding of conversation contexts. However, recent reasoning-based models rely heavily on simplified structures, such as linear or fixed-hierarchical structures, for causality reasoning, and hence cannot fully capture the sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. TREA constructs a multi-hierarchical, scalable tree as the reasoning structure to clarify the causal relationships between mentioned entities, and fully utilizes historical conversations to generate more reasonable and suitable responses for recommended results. Extensive experiments on two public CRS datasets demonstrate the effectiveness of our approach.
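A minimal sketch of the tree idea: entities mentioned in later turns are attached under the earlier entities they causally depend on, so each root-to-leaf path reconstructs one reasoning chain. The node fields and toy dialogue are illustrative, not TREA's actual data structures.

```python
# Sketch of a reasoning-tree node for conversational recommendation: each
# dialogue turn's entities hang under the earlier entities they depend on.
from dataclasses import dataclass, field

@dataclass
class EntityNode:
    entity: str
    turn: int                       # dialogue turn that introduced the entity
    children: list["EntityNode"] = field(default_factory=list)

    def attach(self, child: "EntityNode") -> "EntityNode":
        self.children.append(child)
        return child

    def paths(self, prefix=()):
        """Yield every root-to-leaf reasoning chain."""
        cur = prefix + (self.entity,)
        if not self.children:
            yield cur
        for c in self.children:
            yield from c.paths(cur)

root = EntityNode("likes sci-fi movies", turn=1)
nolan = root.attach(EntityNode("Christopher Nolan", turn=2))
nolan.attach(EntityNode("Interstellar", turn=3))
print(list(root.paths()))
# [('likes sci-fi movies', 'Christopher Nolan', 'Interstellar')]
```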
Citations: 2
Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models
Pub Date : 2023-07-20 DOI: 10.48550/arXiv.2307.10522
Somayeh Ghanbarzadeh, Yan Huang, H. Palangi, R. C. Moreno, Hamed Khanpour
Recent studies have revealed that widely-used Pre-trained Language Models (PLMs) propagate societal biases from their large, unmoderated pre-training corpora. Existing debiasing solutions require dedicated training processes and datasets, which are resource-intensive and costly, and they also hurt the PLMs' performance on downstream tasks. In this study, we propose Gender-tuning, which debiases PLMs through fine-tuning on downstream tasks' datasets. To this aim, Gender-tuning integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process. Comprehensive experiments show that Gender-tuning outperforms state-of-the-art baselines in terms of average gender-bias scores in PLMs while improving their performance on downstream tasks, using only the downstream tasks' datasets. Gender-tuning is thus a deployable debiasing tool for any PLM that works with the original fine-tuning.
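The joint objective is straightforward to sketch: one fine-tuning step combines the downstream classification loss with an MLM loss computed on a masked copy of the same batch. The masking rate, loss weight, and module names below are illustrative assumptions, not the paper's exact training recipe.

```python
# Sketch of a fine-tuning step that adds an MLM objective on the same batch.
import torch
import torch.nn.functional as F

def joint_step(encoder, cls_head, mlm_head, input_ids, attn_mask, labels,
               mask_token_id, mlm_weight=0.5, mask_prob=0.15):
    # Downstream classification loss on the clean input ([CLS] pooling).
    h = encoder(input_ids=input_ids, attention_mask=attn_mask).last_hidden_state
    cls_loss = F.cross_entropy(cls_head(h[:, 0]), labels)
    # MLM loss: randomly mask tokens of the same batch and reconstruct them.
    mlm_ids = input_ids.clone()
    mask = (torch.rand_like(input_ids, dtype=torch.float) < mask_prob)
    mask &= attn_mask.bool()
    mlm_targets = input_ids.masked_fill(~mask, -100)   # ignore unmasked tokens
    mlm_ids[mask] = mask_token_id
    h_m = encoder(input_ids=mlm_ids, attention_mask=attn_mask).last_hidden_state
    mlm_loss = F.cross_entropy(mlm_head(h_m).transpose(1, 2), mlm_targets)
    # Joint objective: fine-tune and keep reconstructing language at once.
    return cls_loss + mlm_weight * mlm_loss
```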
Citations: 1
Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach
Pub Date : 2023-07-17 DOI: 10.48550/arXiv.2307.08859
Nidhi Vakil, Hadi Amiri
A curriculum is a planned sequence of learning materials and an effective one can make learning efficient and effective for both humans and machines. Recent studies developed effective data-driven curriculum learning approaches for training graph neural networks in language applications. However, existing curriculum learning approaches often employ a single criterion of difficulty in their training paradigms. In this paper, we propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms (as difficulty criteria) and model competence during training. The model consists of a scheduling scheme which derives effective curricula by accounting for different views of sample difficulty and model competence during training. The proposed solution advances existing research in curriculum learning for graph neural networks with the ability to incorporate a fine-grained spectrum of graph difficulty criteria in their training paradigms. Experimental results on real-world link prediction and node classification tasks illustrate the effectiveness of the proposed approach.
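A minimal sketch of competence-based scheduling follows, assuming the common square-root competence schedule and a single pre-computed difficulty percentile per graph sample; the paper combines several graph-complexity criteria and multiple views, which is not reproduced here.

```python
# Sketch of competence-based curriculum sampling: at step t the model may
# only draw examples whose difficulty percentile is below its competence c(t).
import random

def competence(t, total_steps, c0=0.1):
    # Square-root schedule: starts at c0, reaches 1.0 at the final step.
    return min(1.0, (c0**2 + (1 - c0**2) * t / total_steps) ** 0.5)

def sample_batch(examples, difficulties, t, total_steps, batch_size=32):
    """difficulties are percentiles in [0, 1], aligned with examples."""
    c = competence(t, total_steps)
    eligible = [e for e, d in zip(examples, difficulties) if d <= c]
    return random.sample(eligible, min(batch_size, len(eligible)))

data = list(range(1000))              # stand-in graph samples
diff = [i / 999 for i in range(1000)] # percentile difficulty per sample
print(len(sample_batch(data, diff, t=10, total_steps=100)))  # 32
```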
Citations: 0
Facilitating Multi-turn Emotional Support Conversation with Positive Emotion Elicitation: A Reinforcement Learning Approach
Pub Date : 2023-07-16 DOI: 10.48550/arXiv.2307.07994
Jinfeng Zhou, Zhuang Chen, Bo Wang, Minlie Huang
Emotional support conversation (ESC) aims to provide emotional support (ES) to improve one's mental state. Existing works stop at fitting grounded responses and responding strategies (e.g., questioning), which ignores the effect on ES and lacks explicit goals to guide positive emotional transitions. To this end, we introduce a new paradigm that formalizes multi-turn ESC as a process of positive emotion elicitation. Addressing this task requires finely adjusting the elicitation intensity in ES as the conversation progresses while maintaining conversational goals such as coherence. In this paper, we propose Supporter, a mixture-of-experts-based reinforcement learning model, and carefully design ES and dialogue-coherence rewards to guide the policy's learning for responding. Experiments verify the superiority of Supporter in achieving positive emotion elicitation during responding while maintaining conversational goals including coherence.
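A toy sketch of the reward shaping: a candidate response is scored by a weighted mix of an emotion-elicitation term and a coherence term, the two signals the policy is trained against. Both scorers here are numeric stand-ins for learned models, and the weighting scheme is our assumption.

```python
# Sketch of reward shaping for an ES policy: mix an elicitation term (did the
# user's emotion move in a positive direction?) with a coherence term.
def support_reward(user_emotion_before: float, user_emotion_after: float,
                   coherence_score: float, alpha: float = 0.6) -> float:
    """Emotions in [-1, 1] (negative to positive); coherence in [0, 1]."""
    elicitation = user_emotion_after - user_emotion_before  # positive transition
    return alpha * elicitation + (1 - alpha) * coherence_score

# A response that lifts the user's emotion and stays on topic scores highest.
print(support_reward(-0.4, 0.2, coherence_score=0.9))  # 0.72
```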
Citations: 1