Latest Publications: Proceedings of the Conference on Empirical Methods in Natural Language Processing

Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences.
Hong Yu, Vasileios Hatzivassiloglou

Opinion question answering is a challenging task for natural language processing. In this paper, we discuss a necessary component for an opinion question answering system: separating opinions from fact, at both the document and sentence level. We present a Bayesian classifier for discriminating between documents with a preponderance of opinions such as editorials from regular news stories, and describe three unsupervised, statistical techniques for the significantly harder task of detecting opinions at the sentence level. We also present a first model for classifying opinion sentences as positive or negative in terms of the main perspective being expressed in the opinion. Results from a large collection of news stories and a human evaluation of 400 sentences are reported, indicating that we achieve very high performance in document classification (upwards of 97% precision and recall), and respectable performance in detecting opinions and classifying them at the sentence level as positive, negative, or neutral (up to 91% accuracy).
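
As a rough, runnable sketch of the document-level component (the general naive Bayes technique, not the authors' exact features or data; the toy corpus below is invented):

```python
# Minimal sketch: a multinomial naive Bayes classifier over n-gram counts
# that separates opinion-heavy documents (editorials) from news stories.
# The four training documents are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "I believe the council's decision is a disgrace and must be reversed.",
    "In my view, the new policy will clearly harm small businesses.",
    "The city council voted 7-2 on Tuesday to approve the new budget.",
    "Officials reported that the storm caused flooding in three districts.",
]
train_labels = ["opinion", "opinion", "fact", "fact"]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(train_docs, train_labels)

# Likely -> ['opinion'] on this toy data; a real system trains on
# thousands of labeled editorials and news stories.
print(clf.predict(["I believe this policy is a disgrace."]))
```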

{"title":"Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences.","authors":"Hong Yu, Vasileios Hatzivassiloglou","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Opinion question answering is a challenging task for natural language processing. In this paper, we discuss a necessary component for an opinion question answering system: separating opinions from fact, at both the document and sentence level. We present a Bayesian classifier for discriminating between documents with a preponderance of opinions such as editorials from regular news stories, and describe three unsupervised, statistical techniques for the significantly harder task of detecting opinions at the sentence level. We also present a first model for classifying opinion sentences as positive or negative in terms of the main perspective being expressed in the opinion. Results from a large collection of news stories and a human evaluation of 400 sentences are reported, indicating that we achieve very high performance in document classification (upwards of 97% precision and recall), and respectable performance in detecting opinions and classifying them at the sentence level as positive, negative, or neutral (up to 91% accuracy).</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2003 ","pages":"129-136"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12836483/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146095221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Improving Complex Knowledge Base Question Answering via Question-to-Action and Question-to-Question Alignment
Yechun Tang, Xiaoxia Cheng, Weiming Lu
Complex knowledge base question answering can be achieved by converting questions into sequences of predefined actions. However, there is a significant semantic and structural gap between natural language and action sequences, which makes this conversion difficult. In this paper, we introduce an alignment-enhanced complex question answering framework, called ALCQA, which mitigates this gap through question-to-action alignment and question-to-question alignment. We train a question rewriting model to align the question and each action, and utilize a pretrained language model to implicitly align the question and KG artifacts. Moreover, considering that similar questions correspond to similar action sequences, we retrieve top-k similar question-answer pairs at the inference stage through question-to-question alignment and propose a novel reward-guided action sequence selection strategy to select from candidate action sequences. We conduct experiments on CQA and WQSP datasets, and the results show that our approach outperforms state-of-the-art methods and obtains a 9.88% improvements in the F1 metric on CQA dataset. Our source code is available at https://github.com/TTTTTTTTy/ALCQA.
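
The question-to-question retrieval step lends itself to a short sketch. The snippet below uses TF-IDF vectors as a stand-in for whatever question encoder ALCQA actually uses; the questions and the helper `top_k_similar` are invented for illustration:

```python
# Retrieve the top-k training questions most similar to a new query,
# the first half of question-to-question alignment at inference time.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_questions = [
    "Which river is the longest in Europe?",
    "Who directed the film Inception?",
    "Which mountain is the highest in Africa?",
    "Who wrote the novel War and Peace?",
]
vectorizer = TfidfVectorizer().fit(train_questions)
train_vecs = vectorizer.transform(train_questions)

def top_k_similar(query: str, k: int = 2) -> list[str]:
    """Return the k training questions most similar to the query."""
    sims = cosine_similarity(vectorizer.transform([query]), train_vecs)[0]
    return [train_questions[i] for i in np.argsort(-sims)[:k]]

print(top_k_similar("Which river is the longest in Africa?"))
```

In the full framework, the action sequences attached to the retrieved questions then become the candidates for the reward-guided selection the abstract describes.
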
{"title":"Improving Complex Knowledge Base Question Answering via Question-to-Action and Question-to-Question Alignment","authors":"Yechun Tang, Xiaoxia Cheng, Weiming Lu","doi":"10.48550/arXiv.2212.13036","DOIUrl":"https://doi.org/10.48550/arXiv.2212.13036","url":null,"abstract":"Complex knowledge base question answering can be achieved by converting questions into sequences of predefined actions. However, there is a significant semantic and structural gap between natural language and action sequences, which makes this conversion difficult. In this paper, we introduce an alignment-enhanced complex question answering framework, called ALCQA, which mitigates this gap through question-to-action alignment and question-to-question alignment. We train a question rewriting model to align the question and each action, and utilize a pretrained language model to implicitly align the question and KG artifacts. Moreover, considering that similar questions correspond to similar action sequences, we retrieve top-k similar question-answer pairs at the inference stage through question-to-question alignment and propose a novel reward-guided action sequence selection strategy to select from candidate action sequences. We conduct experiments on CQA and WQSP datasets, and the results show that our approach outperforms state-of-the-art methods and obtains a 9.88% improvements in the F1 metric on CQA dataset. Our source code is available at https://github.com/TTTTTTTTy/ALCQA.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"52 1","pages":"137-147"},"PeriodicalIF":0.0,"publicationDate":"2022-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84959619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
TextBox 2.0: A Text Generation Library with Pre-trained Language Models
Tianyi Tang, Junyi Li, Z. Chen, Yiwen Hu, Zhuohao Yu, Wen-Dao Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, J. Nie, Ji-rong Wen
To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.
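
TextBox's own API is documented in the linked repository, so the snippet below deliberately does not imitate it. As a stand-in, it shows the kind of one-call generation flow that such a unified library wraps, using the Hugging Face transformers pipeline (all names here come from transformers, not TextBox):

```python
# Load a pre-trained summarization PLM and generate in one call.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
text = (
    "TextBox 2.0 bundles text generation tasks, datasets, and pre-trained "
    "language models behind one interface, so that data loading, training, "
    "and evaluation follow the same steps for every model."
)
print(summarizer(text, max_length=30, min_length=10)[0]["summary_text"])
```
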
{"title":"TextBox 2.0: A Text Generation Library with Pre-trained Language Models","authors":"Tianyi Tang, Junyi Li, Z. Chen, Yiwen Hu, Zhuohao Yu, Wen-Dao Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, J. Nie, Ji-rong Wen","doi":"10.48550/arXiv.2212.13005","DOIUrl":"https://doi.org/10.48550/arXiv.2212.13005","url":null,"abstract":"To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets and further incorporates $45$ PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"452 1","pages":"435-444"},"PeriodicalIF":0.0,"publicationDate":"2022-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76493273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension
Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir R. Radev
Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored the possibility of whether abstractive dialogue summarization can also be used as a means to boost an NLP system’s performance on other important dialogue comprehension tasks. In this paper, we propose a novel type of dialogue summarization task - STRUctured DiaLoguE Summarization (STRUDEL) - that can help pre-trained language models to better understand dialogues and improve their performance on important dialogue comprehension tasks. In contrast to the holistic approach taken by the traditional free-form abstractive summarization task for dialogues, STRUDEL aims to decompose and imitate the hierarchical, systematic and structured mental process that we human beings usually go through when understanding and analyzing dialogues, and thus has the advantage of being more focused, specific and instructive for dialogue comprehension models to learn from. We further introduce a new STRUDEL dialogue comprehension modeling framework that integrates STRUDEL into a dialogue reasoning module over transformer encoder language models to improve their dialogue comprehension ability. In our empirical experiments on two important downstream dialogue comprehension tasks - dialogue question answering and dialogue response prediction - we demonstrate that our STRUDEL dialogue comprehension models can significantly improve the dialogue comprehension performance of transformer encoder language models.
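
The paper defines its own structured summary format; purely as an invented illustration of what "structured" adds over free-form text, a STRUDEL-style entry might pair a dialogue with typed slots rather than a single string (the slot names below are ours, not the paper's):

```python
# Hypothetical structured summary for a two-person dialogue, contrasted
# with a free-form summary of the same conversation.
structured_summary = {
    "participants": ["Alice", "Bob"],
    "intent": "reschedule a meeting",
    "decisions": ["move the Monday sync to Wednesday"],
    "open_questions": ["whether the meeting room is free on Wednesday"],
}
free_form_summary = "Alice and Bob agree to move their Monday sync to Wednesday."

# Downstream tasks can query individual slots instead of re-reading prose.
print(structured_summary["decisions"][0])
```
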
{"title":"STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension","authors":"Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir R. Radev","doi":"10.48550/arXiv.2212.12652","DOIUrl":"https://doi.org/10.48550/arXiv.2212.12652","url":null,"abstract":"Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored the possibility of whether abstractive dialogue summarization can also be used as a means to boost an NLP system’s performance on other important dialogue comprehension tasks. In this paper, we propose a novel type of dialogue summarization task - STRUctured DiaLoguE Summarization (STRUDEL) - that can help pre-trained language models to better understand dialogues and improve their performance on important dialogue comprehension tasks. In contrast to the holistic approach taken by the traditional free-form abstractive summarization task for dialogues, STRUDEL aims to decompose and imitate the hierarchical, systematic and structured mental process that we human beings usually go through when understanding and analyzing dialogues, and thus has the advantage of being more focused, specific and instructive for dialogue comprehension models to learn from. We further introduce a new STRUDEL dialogue comprehension modeling framework that integrates STRUDEL into a dialogue reasoning module over transformer encoder language models to improve their dialogue comprehension ability. In our empirical experiments on two important downstream dialogue comprehension tasks - dialogue question answering and dialogue response prediction - we demonstrate that our STRUDEL dialogue comprehension models can significantly improve the dialogue comprehension performance of transformer encoder language models.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"160 3 1","pages":"4949-4958"},"PeriodicalIF":0.0,"publicationDate":"2022-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83262235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?
Ningyu Xu, Tao Gui, Ruotian Ma, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang
Multilingual BERT (mBERT) has demonstrated considerable cross-lingual syntactic ability, whereby it enables effective zero-shot cross-lingual transfer of syntactic knowledge. The transfer is more successful between some languages, but it is not well understood what leads to this variation and whether it fairly reflects difference between languages. In this work, we investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages. We demonstrate that the distance between the distributions of different languages is highly consistent with the syntactic difference in terms of linguistic formalisms. Such difference learnt via self-supervision plays a crucial role in the zero-shot transfer performance and can be predicted by variation in morphosyntactic properties between languages. These results suggest that mBERT properly encodes languages in a way consistent with linguistic diversity and provide insights into the mechanism of cross-lingual transfer.
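
As a toy rendering of the core measurement (our simplification; the paper's exact metric and relation inventory may differ), one can compare two languages' distributions over grammatical relations with Jensen-Shannon distance. The probabilities below are invented:

```python
# Compare two hypothetical languages' grammatical-relation distributions.
import numpy as np
from scipy.spatial.distance import jensenshannon

relations = ["nsubj", "obj", "amod", "advmod"]
lang_a = np.array([0.40, 0.30, 0.20, 0.10])  # hypothetical language A
lang_b = np.array([0.25, 0.25, 0.35, 0.15])  # hypothetical language B

# SciPy returns the JS *distance* (the square root of the divergence).
print(f"JS distance over {relations}: {jensenshannon(lang_a, lang_b, base=2):.3f}")
```
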
{"title":"Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?","authors":"Ningyu Xu, Tao Gui, Ruotian Ma, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang","doi":"10.48550/arXiv.2212.10879","DOIUrl":"https://doi.org/10.48550/arXiv.2212.10879","url":null,"abstract":"Multilingual BERT (mBERT) has demonstrated considerable cross-lingual syntactic ability, whereby it enables effective zero-shot cross-lingual transfer of syntactic knowledge. The transfer is more successful between some languages, but it is not well understood what leads to this variation and whether it fairly reflects difference between languages. In this work, we investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages. We demonstrate that the distance between the distributions of different languages is highly consistent with the syntactic difference in terms of linguistic formalisms. Such difference learnt via self-supervision plays a crucial role in the zero-shot transfer performance and can be predicted by variation in morphosyntactic properties between languages. These results suggest that mBERT properly encodes languages in a way consistent with linguistic diversity and provide insights into the mechanism of cross-lingual transfer.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"71 1","pages":"8073-8092"},"PeriodicalIF":0.0,"publicationDate":"2022-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74047386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization
Dongmin Hyun, Xiting Wang, Chanyoung Park, Xing Xie, Hwanjo Yu
Sentence summarization shortens given texts while maintaining core contents of the texts. Unsupervised approaches have been studied to summarize texts without human-written summaries. However, recent unsupervised models are extractive, which remove words from texts and thus they are less flexible than abstractive summarization. In this work, we devise an abstractive model based on reinforcement learning without ground-truth summaries. We formulate the unsupervised summarization based on the Markov decision process with rewards representing the summary quality. To further enhance the summary quality, we develop a multi-summary learning mechanism that generates multiple summaries with varying lengths for a given text, while making the summaries mutually enhance each other. Experimental results show that the proposed model substantially outperforms both abstractive and extractive models, while frequently generating new words not contained in the input texts.
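
A minimal sketch of the policy-gradient idea the abstract describes (not the authors' model; the rewards and log-probabilities below are dummy tensors standing in for model outputs):

```python
# REINFORCE-style update: weight each sampled summary's log-likelihood
# by its reward, with a mean baseline for variance reduction.
import torch

log_probs = torch.tensor([-12.3, -9.8, -15.1], requires_grad=True)  # log p(summary_i | text)
rewards = torch.tensor([0.62, 0.81, 0.40])  # e.g., quality scores per sampled summary

baseline = rewards.mean()
loss = -((rewards - baseline) * log_probs).mean()
loss.backward()  # in a real setup, gradients flow into the summarizer
print(loss.item())
```
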
{"title":"Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization","authors":"Dongmin Hyun, Xiting Wang, Chanyoung Park, Xing Xie, Hwanjo Yu","doi":"10.48550/arXiv.2212.10843","DOIUrl":"https://doi.org/10.48550/arXiv.2212.10843","url":null,"abstract":"Sentence summarization shortens given texts while maintaining core contents of the texts. Unsupervised approaches have been studied to summarize texts without human-written summaries. However, recent unsupervised models are extractive, which remove words from texts and thus they are less flexible than abstractive summarization. In this work, we devise an abstractive model based on reinforcement learning without ground-truth summaries. We formulate the unsupervised summarization based on the Markov decision process with rewards representing the summary quality. To further enhance the summary quality, we develop a multi-summary learning mechanism that generates multiple summaries with varying lengths for a given text, while making the summaries mutually enhance each other. Experimental results show that the proposed model substantially outperforms both abstractive and extractive models, yet frequently generating new words not contained in input texts.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"91 1","pages":"2939-2951"},"PeriodicalIF":0.0,"publicationDate":"2022-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90256373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Azimuth: Systematic Error Analysis for Text Classification
Gabrielle Gauthier Melançon, Orlando Marquez Ayala, Lindsay D. Brin, Chris Tyler, Frederic Branchaud-Charron, Joseph Marinier, Karine Grande, Dieu-Thu Le
We present Azimuth, an open-source and easy-to-use tool to perform error analysis for text classification. Compared to other stages of the ML development cycle, such as model training and hyper-parameter tuning, the process and tooling for the error analysis stage are less mature. However, this stage is critical for the development of reliable and trustworthy AI systems. To make error analysis more systematic, we propose an approach comprising dataset analysis and model quality assessment, which Azimuth facilitates. We aim to help AI practitioners discover and address areas where the model does not generalize by leveraging and integrating a range of ML techniques, such as saliency maps, similarity, uncertainty, and behavioral analyses, all in one tool. Our code and documentation are available at github.com/servicenow/azimuth.
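
Azimuth itself is driven through its own configuration and interface (see the repository); as a generic illustration of one analysis such a tool automates, the snippet below tallies the most frequent confusion pairs in a set of predictions (the labels are invented):

```python
# Count (true, predicted) pairs that disagree and report the top confusions.
from collections import Counter

y_true = ["billing", "billing", "shipping", "refund", "refund", "shipping"]
y_pred = ["billing", "refund", "shipping", "refund", "billing", "refund"]

confusions = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
for (true_label, pred_label), n in confusions.most_common(3):
    print(f"{true_label} -> {pred_label}: {n}")
```
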
{"title":"Azimuth: Systematic Error Analysis for Text Classification","authors":"Gabrielle Gauthier Melançon, Orlando Marquez Ayala, Lindsay D. Brin, Chris Tyler, Frederic Branchaud-Charron, Joseph Marinier, Karine Grande, Dieu-Thu Le","doi":"10.48550/arXiv.2212.08216","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08216","url":null,"abstract":"We present Azimuth, an open-source and easy-to-use tool to perform error analysis for text classification. Compared to other stages of the ML development cycle, such as model training and hyper-parameter tuning, the process and tooling for the error analysis stage are less mature. However, this stage is critical for the development of reliable and trustworthy AI systems. To make error analysis more systematic, we propose an approach comprising dataset analysis and model quality assessment, which Azimuth facilitates. We aim to help AI practitioners discover and address areas where the model does not generalize by leveraging and integrating a range of ML techniques, such as saliency maps, similarity, uncertainty, and behavioral analyses, all in one tool. Our code and documentation are available at github.com/servicenow/azimuth.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"71 1","pages":"298-310"},"PeriodicalIF":0.0,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84669864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification
Jai Gupta, Yi Tay, C. Kamath, Vinh Q. Tran, Donald Metzler, S. Bavadekar, Mimi Sun, E. Gabrilovich
With the devastating outbreak of COVID-19, vaccines are one of the crucial lines of defense against mass infection in this global pandemic. Given the protection they provide, vaccines are becoming mandatory in certain social and professional settings. This paper presents a classification model for detecting COVID-19 vaccination related search queries, a machine learning model that is used to generate search insights for COVID-19 vaccinations. The proposed method combines and leverages advancements from modern state-of-the-art (SOTA) natural language understanding (NLU) techniques such as pretrained Transformers with traditional dense features. We propose a novel approach of considering dense features as memory tokens that the model can attend to. We show that this new modeling approach enables a significant improvement to the Vaccine Search Insights (VSI) task, improving a strong well-established gradient-boosting baseline by relative +15% improvement in F1 score and +14% in precision.
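
A sketch of the "dense features as memory tokens" idea (shapes and names are our assumptions, not the paper's code): project each dense feature into the model's hidden size and prepend the resulting vectors to the token embeddings, so self-attention can attend to them like ordinary tokens:

```python
import torch
import torch.nn as nn

hidden, n_feats, seq_len, batch = 64, 5, 16, 2
project = nn.Linear(1, hidden)  # one learned embedding per dense feature

dense_feats = torch.randn(batch, n_feats, 1)        # e.g., query-level statistics
token_embeds = torch.randn(batch, seq_len, hidden)  # from the usual embedding layer

memory_tokens = project(dense_feats)                # (batch, n_feats, hidden)
augmented = torch.cat([memory_tokens, token_embeds], dim=1)
print(augmented.shape)  # torch.Size([2, 21, 64]), fed to the Transformer stack
```
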
{"title":"Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification","authors":"Jai Gupta, Yi Tay, C. Kamath, Vinh Q. Tran, Donald Metzler, S. Bavadekar, Mimi Sun, E. Gabrilovich","doi":"10.48550/arXiv.2212.13898","DOIUrl":"https://doi.org/10.48550/arXiv.2212.13898","url":null,"abstract":"With the devastating outbreak of COVID-19, vaccines are one of the crucial lines of defense against mass infection in this global pandemic. Given the protection they provide, vaccines are becoming mandatory in certain social and professional settings. This paper presents a classification model for detecting COVID-19 vaccination related search queries, a machine learning model that is used to generate search insights for COVID-19 vaccinations. The proposed method combines and leverages advancements from modern state-of-the-art (SOTA) natural language understanding (NLU) techniques such as pretrained Transformers with traditional dense features. We propose a novel approach of considering dense features as memory tokens that the model can attend to. We show that this new modeling approach enables a significant improvement to the Vaccine Search Insights (VSI) task, improving a strong well-established gradient-boosting baseline by relative +15% improvement in F1 score and +14% in precision.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"43 1","pages":"521-530"},"PeriodicalIF":0.0,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84126133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks
Kai Xiong, Xiao Ding, Zhongyang Li, L. Du, Bing Qin, Yi Zheng, Baoxing Huai
Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs. However, CCR suffers from two main transitive problems: threshold effect and scene drift. In other words, the causal pairs to be spliced may have a conflicting threshold boundary or scenario. To address these issues, we propose a novel Reliable Causal chain reasoning framework (ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via structural causal recurrent neural networks (SRNN). Experiments show that ReCo outperforms a series of strong baselines on both Chinese and English CCR datasets. Moreover, by injecting reliable causal chain knowledge distilled by ReCo, BERT can achieve better performances on four downstream causal-related tasks than BERT models enhanced by other kinds of knowledge.
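
A loose illustration of the chain-building setting only (ReCo's SRNN and exogenous-variable machinery are beyond a toy snippet): causal pairs splice into a chain when one pair's effect matches the next pair's cause, and it is exactly such naive splices that threshold effects and scene drift can invalidate:

```python
# Splice cause-effect pairs into a chain; the pairs are invented.
pairs = [
    ("heavy rain", "flooding"),
    ("flooding", "road closures"),
    ("road closures", "traffic delays"),
]
chain = [pairs[0][0]]
for cause, effect in pairs:
    if cause == chain[-1]:
        chain.append(effect)
print(" -> ".join(chain))  # heavy rain -> flooding -> road closures -> traffic delays
```
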
{"title":"ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks","authors":"Kai Xiong, Xiao Ding, Zhongyang Li, L. Du, Bing Qin, Yi Zheng, Baoxing Huai","doi":"10.48550/arXiv.2212.08322","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08322","url":null,"abstract":"Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs. However, CCR suffers from two main transitive problems: threshold effect and scene drift. In other words, the causal pairs to be spliced may have a conflicting threshold boundary or scenario.To address these issues, we propose a novel Reliable Causal chain reasoning framework (ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via structural causal recurrent neural networks (SRNN). Experiments show that ReCo outperforms a series of strong baselines on both Chinese and English CCR datasets. Moreover, by injecting reliable causal chain knowledge distilled by ReCo, BERT can achieve better performances on four downstream causal-related tasks than BERT models enhanced by other kinds of knowledge.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"30 1","pages":"6426-6438"},"PeriodicalIF":0.0,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86496573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Injecting Domain Knowledge in Language Models for Task-oriented Dialogue Systems
Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour
Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, knowledge bases (KBs) utilized in these studies are usually large-scale and static, in contrast to small, domain-specific, and modifiable knowledge bases that are prominent in real-world task-oriented dialogue (TOD) systems. In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. To this end, we utilize light-weight adapters that can be easily integrated with PLMs and serve as a repository for facts learned from different KBs. To measure the efficacy of proposed knowledge injection methods, we introduce Knowledge Probing using Response Selection (KPRS) – a probe designed specifically for TOD models. Experiments on KPRS and the response generation task show improvements of knowledge injection with adapters over strong baselines.
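
A minimal bottleneck adapter in the usual residual style (the paper's exact adapter variant may differ; the sizes below are illustrative):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Light-weight module inserted into a frozen PLM layer."""
    def __init__(self, hidden: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual: the adapter learns a small correction on top of the
        # frozen layer's output, here storing KB facts.
        return x + self.up(torch.relu(self.down(x)))

x = torch.randn(2, 16, 768)  # (batch, seq_len, hidden) from a frozen layer
print(Adapter()(x).shape)    # torch.Size([2, 16, 768])
```
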
{"title":"Injecting Domain Knowledge in Language Models for Task-oriented Dialogue Systems","authors":"Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour","doi":"10.48550/arXiv.2212.08120","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08120","url":null,"abstract":"Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, knowledge bases (KBs) utilized in these studies are usually large-scale and static, in contrast to small, domain-specific, and modifiable knowledge bases that are prominent in real-world task-oriented dialogue (TOD) systems. In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. To this end, we utilize light-weight adapters that can be easily integrated with PLMs and serve as a repository for facts learned from different KBs. To measure the efficacy of proposed knowledge injection methods, we introduce Knowledge Probing using Response Selection (KPRS) – a probe designed specifically for TOD models. Experiments on KPRS and the response generation task show improvements of knowledge injection with adapters over strong baselines.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"56 1","pages":"11962-11974"},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79138347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7