Content-aware Topic Recognition of Inquiry Letter
Wei Wang, Guiying Wei, Xiaonan Gao, Huixia He
2021 International Conference on Computer, Blockchain and Financial Development (CBFD), 2021-04-01
DOI: 10.1109/CBFD52659.2021.00039 (https://doi.org/10.1109/CBFD52659.2021.00039)

To address the problem of matching the label of an inquiry letter to its text content, this paper proposes a content-aware topic recognition model for inquiry letters. First, the text of the inquiry letters of the Shanghai Stock Exchange is collected, and content features are extracted with TF-IDF word vectorization. The t-SNE dimensionality reduction algorithm then reduces the text vectors to a 2-dimensional space, the K-means algorithm clusters the reduced feature data, and the inquiry letters are labeled according to the clustering results. Finally, the deep forest classification algorithm is trained on the labeled inquiry letters and used to classify them. The application results show that content-aware topic recognition improves recall, precision, and F-measure compared with traditional methods, which indicates that the proposed content-based method is effective.
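
The labeling stage described above (TF-IDF features followed by K-means clustering) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the smoothed IDF formula, the farthest-point initialization, and the four toy English documents are all assumptions made for illustration. The t-SNE reduction to two dimensions and the deep forest classifier are deliberately left out, so K-means runs directly on the TF-IDF vectors here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant; the abstract does not specify one).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [counts[t] / len(doc) * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Plain K-means; returns one cluster index per point."""
    # Deterministic farthest-point initialisation (assumption): start from
    # the first point, then add the point farthest from the chosen centres.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, pre-tokenised stand-ins for inquiry letters.
docs = [
    ["revenue", "recognition", "audit", "company"],
    ["revenue", "audit", "receivable", "company"],
    ["merger", "asset", "valuation", "company"],
    ["merger", "asset", "restructuring", "company"],
]

vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

The resulting cluster indices play the role of the content-derived labels that a downstream classifier would then be trained on. Real inquiry letters are Chinese text, so a word segmentation step would be needed before tokens like those above are available.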