Research on classification method of business requirement text based on deep learning
Weibing Ding, S. Jin, Yan Ren, Fangzhou Liu
Proceedings of the 7th International Conference on Cyber Security and Information Engineering, 2022-09-23
DOI: 10.1145/3558819.3565082
Power grid business cost requirement texts are complex, and their descriptions are not unified or standardized. Because a single description can involve multiple business types, it is difficult to determine the business cost type. This paper presents a classification method that clusters business cost requirement texts into specific cost types. First, the business cost requirement text is transformed, and the key weight parameters in the Chinese word segmentation model are iteratively refined according to the cost representation report to obtain a global semantic vector. At the same time, the weights of the recognition loss for different samples are dynamically adjusted according to how difficult each sample is to fit. The existing text clustering model is improved with a k-means clustering algorithm, and the cost types of 450 real business cost requirement texts from the province are identified. The results show that the proposed method outperforms commonly used text classification methods on the performance metrics: its F1 score exceeds 93%, more than 3.5% higher than that of a single BERT model.
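The abstract does not give the formula used to reweight the loss by sample difficulty. As an illustration only, the sketch below uses a focal-loss-style weight, (1 - p_true) ** gamma, which is one common way to down-weight samples the model already fits well. The function name, the `gamma` parameter, and the toy probabilities are assumptions made for this example and are not taken from the paper.

```python
import math

def difficulty_weighted_loss(probs, labels, gamma=2.0):
    """Mean cross-entropy with each sample weighted by (1 - p_true) ** gamma.

    probs  : list of per-class probability lists, one per sample
    labels : list of integer class indices (the true class of each sample)
    gamma  : hypothetical focusing parameter; gamma = 0 gives plain cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p_true = min(max(p[y], 1e-12), 1.0)  # clamp to avoid log(0)
        weight = (1.0 - p_true) ** gamma     # close to 0 for well-fit samples
        total += -weight * math.log(p_true)
    return total / len(labels)

# Toy predictions over three hypothetical cost types (illustrative values only).
probs = [
    [0.90, 0.05, 0.05],  # easy sample: confidently correct
    [0.40, 0.35, 0.25],  # hard sample: barely correct
]
labels = [0, 0]

plain = difficulty_weighted_loss(probs, labels, gamma=0.0)
focal = difficulty_weighted_loss(probs, labels, gamma=2.0)
print(f"plain cross-entropy: {plain:.4f}")
print(f"difficulty-weighted: {focal:.4f}")
```

With `gamma` above zero, the easy sample contributes almost nothing to the total, so the gradient signal is dominated by the hard sample. Whether the paper uses this exact weighting or a different schedule is not stated in the abstract.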