Research on classification method of business requirement text based on deep learning

Weibing Ding, S. Jin, Yan Ren, Fangzhou Liu
Published in: Proceedings of the 7th International Conference on Cyber Security and Information Engineering
Publication date: 2022-09-23
DOI: 10.1145/3558819.3565082 (https://doi.org/10.1145/3558819.3565082)
Citations: 1

Abstract

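Descriptions of power grid business cost requirements are complex and follow no unified, standardized form. Because a single description can involve several business types, its cost type is difficult to determine. This paper presents a classification method that clusters business cost requirement texts into specific cost types. First, the requirement text is transformed, and the key weight parameters of the Chinese word segmentation model are refined iteratively according to the cost representation report to obtain a global semantic vector. At the same time, the loss weight of each sample is adjusted dynamically according to how difficult the sample is to fit. The existing text clustering model is then improved with the k-means clustering algorithm and used to identify the cost types of 450 real business cost requirement texts from the province. The proposed method outperforms commonly used text classification methods on the reported performance metrics: its F1 score exceeds 93%, more than 3.5% higher than that of a single BERT model.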
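The abstract does not give the formula used to reweight per-sample losses by fitting difficulty. As an illustration only, the sketch below uses a focal-loss-style weight, a well-known technique of this kind that may differ from the authors' actual scheme. The function name `focal_weighted_loss` and the parameter `gamma` are assumptions introduced here, not taken from the paper.

```python
import math


def focal_weighted_loss(probs, labels, gamma=2.0):
    """Mean cross-entropy with a focal-style difficulty weight (illustrative).

    probs  -- list of per-class probability lists, one per sample
    labels -- list of true class indices, one per sample
    gamma  -- assumed focusing parameter; gamma=0 gives plain cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp the true-class probability to avoid log(0).
        p_true = min(max(p[y], 1e-12), 1.0)
        # Poorly fitted samples (low p_true) receive a larger weight.
        weight = (1.0 - p_true) ** gamma
        total += -weight * math.log(p_true)
    return total / len(labels)


# A well-fitted sample contributes far less than a poorly fitted one.
easy_loss = focal_weighted_loss([[0.9, 0.1]], [0])
hard_loss = focal_weighted_loss([[0.3, 0.7]], [0])
print(easy_loss < hard_loss)
```

With `gamma=0` the weight is always 1 and the function reduces to ordinary cross-entropy, which makes the effect of the difficulty weighting easy to compare against an unweighted baseline.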