Latest Publications in Natural Language Engineering

Emerging trends: When can users trust GPT, and when should they intervene?
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-01-16 · DOI: 10.1017/s1351324923000578
Kenneth Church

Usage of large language models and chat bots will almost surely continue to grow, since they are so easy to use, and so (incredibly) credible. I would be more comfortable with this reality if we encouraged more evaluations with humans-in-the-loop to come up with a better characterization of when the machine can be trusted and when humans should intervene. This article will describe a homework assignment, where I asked my students to use tools such as chat bots and web search to write a number of essays. Even after considerable discussion in class on hallucinations, many of the essays were full of misinformation that should have been fact-checked. Apparently, it is easier to believe ChatGPT than to be skeptical. Fact-checking and web search are too much trouble.

Citations: 0
Lightweight transformers for clinical natural language processing
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-01-12 · DOI: 10.1017/s1351324923000542
Omid Rohanian, Mohammadmahdi Nouriborji, Hannah Jauncey, Samaneh Kouchaki, Farhad Nooralahzadeh, ISARIC Clinical Characterisation Group, Lei Clifton, Laura Merson, David A. Clifton

Specialised pre-trained language models are becoming more frequent in Natural Language Processing (NLP) since they can potentially outperform models trained on generic texts. BioBERT (Lee et al., BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 2020) and BioClinicalBERT (Alsentzer et al., Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pp. 72–78, 2019) are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like knowledge distillation, it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on the development of compact language models for processing clinical texts (e.g. progress notes, discharge summaries). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from 15 million to 65 million. These models performed comparably to larger models such as BioBERT and ClinicalBioBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including natural language inference, relation extraction, named entity recognition and sequence classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Hugging Face profile at https://huggingface.co/nlpie and GitHub page at https://github.com/nlpie-research/Lightweight-Clinical-Transformers, respectively, promoting reproducibility of our results.
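The abstract does not spell out the distillation objective itself, but a minimal sketch of the standard knowledge-distillation loss this line of work builds on combines a temperature-softened KL term against the teacher with ordinary cross-entropy against the gold labels; the temperature and mixing weight below are illustrative assumptions, not values reported by the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soften both distributions with the temperature, then penalize
    # divergence from the teacher; the T^2 factor keeps gradient
    # magnitudes comparable across temperatures.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Ordinary supervised loss against the gold labels.
    ce = F.cross_entropy(student_logits, labels)
    # alpha is an illustrative mixing weight, not a value from the paper.
    return alpha * kd + (1 - alpha) * ce
```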

Citations: 0
Actionable conversational quality indicators for improving task-oriented dialog systems
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-01-09 · DOI: 10.1017/s1351324923000372
Michael Higgins, Dominic Widdows, Beth Ann Hockey, Akshay Hazare, Kristen Howell, Gwen Christian, Sujit Mathi, Chris Brew, Andrew Maurer, George Bonev, Matthew Dunn, Joseph Bradley
Automatic dialog systems have become a mainstream part of online customer service. Many such systems are built, maintained, and improved by customer service specialists, rather than dialog systems engineers and computer programmers. As conversations between people and machines become commonplace, it is critical to understand what is working, what is not, and what actions can be taken to reduce the frequency of inappropriate system responses. These analyses and recommendations need to be presented in terms that directly reflect the user experience rather than the internal dialog processing. This paper introduces and explains the use of Actionable Conversational Quality Indicators (ACQIs), which are used both to recognize parts of dialogs that can be improved and to recommend how to improve them. This combines the benefits of previous approaches, some of which have focused on producing dialog quality scores while others have sought to categorize the types of errors the dialog system is making. We demonstrate the effectiveness of using ACQIs on LivePerson internal dialog systems used in commercial customer service applications and on the publicly available LEGOv2 conversational dataset. We report on the annotation and analysis of conversational datasets showing which ACQIs are important to fix in various situations. The annotated datasets are then used to build a predictive model which uses a turn-based vector embedding of the message texts and achieves a 79% weighted-average F1-measure on the task of finding the correct ACQI for a given conversation. We predict that if such a model worked perfectly, the range of potential improvement actions a bot-builder must consider at each turn could be reduced by an average of 81%.
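As a rough illustration of the prediction task, the sketch below encodes each turn's message text with an off-the-shelf sentence encoder, trains a simple classifier, and scores it with the same weighted-average F1 metric the paper reports; the encoder checkpoint and the ACQI label names are placeholders, not the paper's actual model or label inventory.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Toy ACQI-style labels, invented for illustration only.
train_texts = ["I want to change my order", "Sorry, I didn't get that.",
               "Please type your order number", "I don't understand"]
train_labels = ["user_goal", "bot_misunderstanding",
                "bot_request", "bot_misunderstanding"]
test_texts = ["I'd like to modify my purchase"]
test_labels = ["user_goal"]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in turn encoder
clf = LogisticRegression(max_iter=1000)
clf.fit(encoder.encode(train_texts), train_labels)
pred = clf.predict(encoder.encode(test_texts))

# The paper reports a weighted-average F1 of 0.79 on this task;
# sklearn's "weighted" averaging matches that metric definition.
print(f1_score(test_labels, pred, average="weighted"))
```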
Citations: 0
A year’s a long time in generative AI
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-01-08 · DOI: 10.1017/s1351324923000554
Robert Dale

A lot has happened since OpenAI released ChatGPT to the public in November 2022. We review how things unfolded over the course of the year, tracking significant events and announcements from the tech giants leading the generative AI race and from other players of note; along the way we note the wider impacts of the technology’s progress.

Citations: 0
OffensEval 2023: Offensive language identification in the age of Large Language Models
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-12-06 · DOI: 10.1017/s1351324923000517
Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe

The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LLMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.
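A zero-shot prompting setup of the kind evaluated here can be sketched with Flan-T5 (one of the models the abstract names) and the binary OLID level A labels; the prompt wording is our assumption rather than the paper's exact template.

```python
from transformers import pipeline

# Flan-T5 is one of the LLMs the survey evaluates; the prompt text
# below is illustrative, not the paper's exact template.
generator = pipeline("text2text-generation", model="google/flan-t5-base")

def classify(tweet: str) -> str:
    prompt = ("Classify the following tweet as OFF (offensive) or NOT "
              "(not offensive). Answer with one label only.\n"
              f"Tweet: {tweet}\nLabel:")
    return generator(prompt, max_new_tokens=4)[0]["generated_text"].strip()

print(classify("@USER have a great day"))  # expected output: NOT
```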

Citations: 0
Preface: Special issue on NLP approaches to offensive content online
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-12-06 · DOI: 10.1017/s1351324923000499
Marcos Zampieri, Isabelle Augenstein, Siddharth Krishnan, Joshua Melton, Preslav Nakov
We are delighted to present the Special Issue on NLP Approaches to Offensive Content Online, published in the Journal of Natural Language Engineering, issue 29.6. We are happy to have received a total of 26 submissions to the special issue, evidencing the interest of the NLP community in this topic. Our guest editorial board, composed of international experts in the field, worked hard to review all submissions over multiple rounds of peer review. Ultimately, we accepted nine articles to appear in this special issue.
Citations: 0
Data-to-text generation using conditional generative adversarial with enhanced transformer
IF 2.5 · CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-11-28 · DOI: 10.1017/s1351324923000487
Elham Seifossadat, Hossein Sameti
In this paper, we propose an enhanced version of the vanilla transformer for data-to-text generation and then use it as the generator of a conditional generative adversarial model to improve the semantic quality and diversity of output sentences. Specifically, by adding a diagonal mask matrix to the attention scores of the encoder and using the history of the attention weights in the decoder, this enhanced version of the vanilla transformer prevents semantic defects in the output text. Also, by using this enhanced transformer and a triplet network as, respectively, the generator and discriminator of the conditional generative adversarial network, the diversity and semantic quality of sentences are guaranteed. To prove the effectiveness of the proposed model, called conditional generative adversarial with enhanced transformer (CGA-ET), we performed experiments on three different datasets and observed that our proposed model achieves better results than the baseline models in terms of the BLEU, METEOR, NIST, ROUGE-L, CIDEr, BERTScore, and SER automatic evaluation metrics as well as human evaluation.
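One plausible reading of the diagonal mask is that each token is blocked from attending to itself, forcing the encoder to build representations from context; the sketch below adds such a mask to the attention scores before the softmax. This is an illustration of the idea under that assumption, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def self_masked_attention(q, k, v):
    # Scaled dot-product attention with a diagonal mask added to the
    # scores so that no token attends to itself; one plausible reading
    # of the "diagonal mask matrix" described in the abstract.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5        # (..., L, L)
    eye = torch.eye(scores.size(-1), dtype=torch.bool, device=scores.device)
    scores = scores.masked_fill(eye, float("-inf"))      # block the diagonal
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 5, 64)   # (batch, seq_len, d_model)
print(self_masked_attention(q, k, v).shape)  # torch.Size([2, 5, 64])
```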
Citations: 0
Abstractive summarization with deep reinforcement learning using semantic similarity rewards
CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-10-31 · DOI: 10.1017/s1351324923000505
Figen Beken Fikri, Kemal Oflazer, Berrin Yanıkoğlu
Abstractive summarization is an approach to document summarization that is not limited to selecting sentences from the document but can generate new sentences as well. We address the two main challenges in abstractive summarization: how to evaluate the performance of a summarization model and what makes a good training objective. We first introduce new evaluation measures based on the semantic similarity of the input document and the corresponding summary. The similarity scores are obtained by a fine-tuned BERTurk model using either a cross-encoder or a bi-encoder architecture. The fine-tuning is done on the Turkish Natural Language Inference and Semantic Textual Similarity benchmark datasets. We show that these measures correlate better with human evaluations than Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores and BERTScore. We then introduce a deep reinforcement learning algorithm that uses the proposed semantic similarity measures as rewards, together with a mixed training objective, in order to generate more natural summaries in terms of human readability. We show that training with the mixed objective, compared to the maximum-likelihood objective alone, improves similarity scores.
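The reward and the mixed objective can be sketched as follows: a bi-encoder supplies a cosine-similarity reward between source and summary, and the training loss mixes a self-critical REINFORCE term with the maximum-likelihood term, in the style of Paulus et al. (2018). The encoder checkpoint stands in for the paper's fine-tuned BERTurk bi-encoder, and the mixing weight is illustrative.

```python
from sentence_transformers import SentenceTransformer, util

# Stand-in for the paper's fine-tuned BERTurk bi-encoder; any sentence
# encoder illustrates the reward computation.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def similarity_reward(source: str, summary: str) -> float:
    # Cosine similarity of the two embeddings serves as the RL reward.
    embeddings = encoder.encode([source, summary], convert_to_tensor=True)
    return util.cos_sim(embeddings[0], embeddings[1]).item()

def mixed_loss(sample_logprob, sample_reward, greedy_reward,
               ml_loss, gamma=0.98):
    # Self-critical REINFORCE term: the greedy decode acts as baseline,
    # pushing sampled summaries above the model's own greedy output.
    rl_loss = -(sample_reward - greedy_reward) * sample_logprob
    # gamma is an illustrative mixing weight, not the paper's value.
    return gamma * rl_loss + (1 - gamma) * ml_loss
```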
Citations: 0
Neural Arabic singular-to-plural conversion using a pretrained Character-BERT and a fused transformer
CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-10-11 · DOI: 10.1017/s1351324923000475
Azzam Radman, Mohammed Atros, Rehab Duwairi
Morphological re-inflection generation is one of the most challenging tasks in the natural language processing (NLP) domain, especially for morphologically rich, low-resource languages like Arabic. In this research, we investigate the ability of transformer-based models in the singular-to-plural Arabic noun conversion task. In one of the proposed settings, we start by pretraining a Character-BERT model on a masked language modeling task using 1,134,950 Arabic words and then adopt a fusion technique to transfer the knowledge gained by the pretrained model to a full encoder–decoder transformer model. The second proposed setting fuses the output Character-BERT embeddings directly into the decoder. We then analyze and compare the performance of the two architectures and provide an interpretability section in which we track the model's attention features. We perform the interpretation on both the macro and micro levels, providing individual examples. Moreover, we provide a thorough error analysis showing the strengths and weaknesses of the proposed framework. To the best of our knowledge, this is the first effort in the Arabic NLP domain to develop an end-to-end fused-transformer deep learning model for the problem of singular-to-plural conversion.
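The second setting, fusing Character-BERT outputs directly into the decoder, might look like the sketch below, where pretrained character-level states are concatenated with the decoder's token embeddings and projected back down; the dimensions and the concatenation-plus-projection fusion operator are assumptions on our part.

```python
import torch
import torch.nn as nn

class FusedDecoderEmbedding(nn.Module):
    # One plausible reading of the second setting: pretrained
    # Character-BERT output states are fused with the decoder's own
    # token embeddings by concatenation and a linear projection.
    def __init__(self, vocab_size, d_model=512, d_charbert=768):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model + d_charbert, d_model)

    def forward(self, token_ids, charbert_states):
        # charbert_states: (batch, seq, d_charbert) from the pretrained
        # Character-BERT encoder; token_ids: (batch, seq)
        fused = torch.cat([self.tok_embed(token_ids), charbert_states], dim=-1)
        return self.proj(fused)   # (batch, seq, d_model)
```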
Citations: 0
Perceptional and actional enrichment for metaphor detection with sensorimotor norms
CAS Tier 3, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2023-09-20 · DOI: 10.1017/s135132492300044x
Mingyu Wan, Qi Su, Kathleen Ahrens, Chu-Ren Huang
Understanding the nature of meaning and its extensions (with metaphor as one typical kind) has been a core issue in figurative language study since Aristotle's time. This research takes a computational cognitive perspective to model metaphor, based on the assumption that meaning is perceptual, embodied, and encyclopedic. We model word meaning representation for metaphor detection with embodiment information obtained from behavioral experiments. Our work is the first attempt to incorporate sensorimotor knowledge into neural networks for metaphor detection, and it demonstrates superiority, consistency, and interpretability compared to peer systems on two general datasets. In addition, with cross-sectional analysis of different feature schemas, our results suggest that metaphor, as a device of cognitive conceptualization, can be 'learned' from perceptual and actional information independently of several more explicit levels of linguistic representation. Access to such knowledge allows us to probe further into word-meaning mapping tendencies relevant to our conceptualization of, and reaction to, the physical world.
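The enrichment idea can be illustrated by concatenating a word's embedding with its sensorimotor ratings (for instance, the eleven perceptual and action-effector dimensions of the Lancaster Sensorimotor Norms) before classification; the lookup values below are toy placeholders, not the paper's actual features.

```python
import numpy as np

# Toy lookup of sensorimotor ratings; real norms rate each word on
# eleven dimensions (six perceptual modalities, five action effectors).
SENSORIMOTOR = {
    "attack": np.array([1.2, 0.3, 2.8, 1.0, 0.2, 3.5,
                        2.1, 3.9, 2.4, 1.1, 2.6]),
}

def enrich(word: str, word_vec: np.ndarray) -> np.ndarray:
    # Back off to zeros for words missing from the norms.
    norms = SENSORIMOTOR.get(word, np.zeros(11))
    return np.concatenate([word_vec, norms])

features = enrich("attack", np.random.rand(300))
print(features.shape)  # (311,) — embedding plus 11 norm dimensions
```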
Citations: 0