A novel abstractive summarization model based on topic-aware and contrastive learning

IF 2.7 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-04 DOI:10.1007/s13042-024-02263-8

Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu

{"title":"A novel abstractive summarization model based on topic-aware and contrastive learning","authors":"Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu","doi":"10.1007/s13042-024-02263-8","DOIUrl":null,"url":null,"abstract":"<p>The majority of abstractive summarization models are designed based on the Sequence-to-Sequence(Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there exist inconsistency between the objective function and evaluation metrics of this model. To address these limitations, a novel model named ASTCL is proposed in this paper. It integrates the neural topic model into the Seq2Seq framework innovatively, aiming to capture the text’s global semantic information and guide the summary generation. Additionally, it incorporates contrastive learning techniques to mitigate the discrepancy between the objective loss and the evaluation metrics through scoring multiple candidate summaries. On CNN/DM XSum and NYT datasets, the experimental results demonstrate that the ASTCL model outperforms the other generic models in summarization task.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"48 1","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Machine Learning and Cybernetics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s13042-024-02263-8","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

The majority of abstractive summarization models are designed based on the Sequence-to-Sequence(Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there exist inconsistency between the objective function and evaluation metrics of this model. To address these limitations, a novel model named ASTCL is proposed in this paper. It integrates the neural topic model into the Seq2Seq framework innovatively, aiming to capture the text’s global semantic information and guide the summary generation. Additionally, it incorporates contrastive learning techniques to mitigate the discrepancy between the objective loss and the evaluation metrics through scoring multiple candidate summaries. On CNN/DM XSum and NYT datasets, the experimental results demonstrate that the ASTCL model outperforms the other generic models in summarization task.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于主题感知和对比学习的新型抽象摘要模型

大多数抽象摘要模型都是基于序列到序列（Sequence-to-Sequence，Seq2Seq）架构设计的。这些模型能够捕捉词与词之间的句法和上下文信息。然而，基于 Seq2Seq 的摘要模型往往会忽略全局语义信息。此外，该模型的目标函数和评价指标之间也存在不一致。针对这些局限性，本文提出了一种名为 ASTCL 的新型模型。它将神经主题模型创新性地集成到 Seq2Seq 框架中，旨在捕捉文本的全局语义信息并指导摘要的生成。此外，它还结合了对比学习技术，通过对多个候选摘要进行评分来减少客观损失与评价指标之间的差异。在 CNN/DM XSum 和 NYT 数据集上的实验结果表明，ASTCL 模型在摘要任务中的表现优于其他通用模型。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

International Journal of Machine Learning and Cybernetics COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-

CiteScore

7.90

自引率

10.70%

发文量

225

期刊介绍： Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC. Key research areas to be covered by the journal include: Machine Learning for modeling interactions between systems Pattern Recognition technology to support discovery of system-environment interaction Control of system-environment interactions Biochemical interaction in biological and biologically-inspired systems Learning for improvement of communication schemes between systems