Authors: Fuping Ren, Jian Chen, Defu Zhang
DOI: 10.5121/csit.2024.140214
Journal: AI, Machine Learning and Applications
Published: 2024-01-27 (Journal Article)
An Improved mT5 Model for Chinese Text Summary Generation
Understanding complex policy documents can be challenging, which highlights the need for intelligent interpretation of Chinese policy texts. To improve Chinese text summarization, this study used the mT5 model as the core framework and source of initial weights. It further reduced the model size through parameter clipping, employed the Gap Sentence Generation (GSG) method as an unsupervised pre-training objective, and improved the Chinese tokenizer. After training on a carefully processed 30 GB Chinese corpus, the study obtained the enhanced mT5-GSG model. When fine-tuning on Chinese policy texts, it adopted a "Dropout Twice" approach, merging the output probability distributions of the two dropout passes via the Wasserstein distance. Experimental results show that the proposed model achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 56.13%, 45.76%, and 56.41%, respectively, on a Chinese policy text summarization dataset.
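The abstract does not give implementation details for merging the two dropout distributions. A minimal sketch of one plausible reading (an R-Drop-style consistency term, here using the 1-D Wasserstein distance between the two softmax outputs as the regularizer) is shown below; the function names, the dropout rate, and the weighting factor `lam` are all illustrative assumptions, not the authors' actual code.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def dropout(x: np.ndarray, rate: float, rng: np.random.Generator) -> np.ndarray:
    """Inverted dropout: zero out units with probability `rate`, rescale the rest."""
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def wasserstein_1d(p: np.ndarray, q: np.ndarray) -> float:
    """1-D Wasserstein distance between two discrete distributions
    on the same ordered support: sum of |CDF_p - CDF_q|."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

# "Dropout Twice": run the same logits through dropout twice, giving two
# stochastic probability distributions over the vocabulary (illustrative).
rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, 0.1])   # toy decoder logits
p1 = softmax(dropout(logits, rate=0.1, rng=rng))
p2 = softmax(dropout(logits, rate=0.1, rng=rng))

# Consistency penalty between the two passes; in training this would be
# added to the task loss, weighted by a hyperparameter `lam` (assumed).
lam = 0.5
consistency = lam * wasserstein_1d(p1, p2)
```

Relative to the KL divergence used in R-Drop, the Wasserstein distance stays finite and well-behaved even where one distribution assigns near-zero mass, which is one motivation for such a swap.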