Improving the Performance of Text Categorization Using Automatic Summarization
Xiao-yu Jiang, Xiao-zhong Fan, Zhi-Fei Wang, Ke-liang Jia
2009 International Conference on Computer Modeling and Simulation, published 2009-02-20
DOI: 10.1109/ICCMS.2009.29 (https://doi.org/10.1109/ICCMS.2009.29)
To reduce the dimensionality of the feature vector space and the computational complexity of categorization, each document in the training set is summarized automatically, and two approaches to text categorization based on these summaries are proposed. In the first approach, the summary is used directly for feature selection and categorization in place of the original text. In the second approach, each summary is used to select and weight features for its document, and free texts are classified using the k-nearest neighbor (KNN) algorithm. Experimental results show that both methods not only reduce classifier training time but also improve text categorization performance.
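The second approach can be illustrated with a minimal sketch. The abstract does not specify the summarizer, the weighting formula, or the similarity measure, so everything below is an assumption made for illustration: `summarize` is a hypothetical stand-in that keeps the leading sentence, `summary_weighted_vector` keeps only terms that occur in the summary and adds a `boost` for them, and KNN uses cosine similarity with majority voting. The toy corpus `TRAIN` is invented and is not data from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def summarize(text, n_sentences=1):
    """Hypothetical stand-in summarizer: keep the leading sentence(s)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return " ".join(sentences[:n_sentences])


def summary_weighted_vector(text, boost=2.0):
    """Select the terms found in the summary and weight them.

    Assumed weighting: full-text frequency plus boost * summary frequency.
    Terms that do not appear in the summary are dropped (feature selection).
    """
    summary_tf = Counter(tokenize(summarize(text)))
    full_tf = Counter(tokenize(text))
    return {t: full_tf[t] + boost * summary_tf[t] for t in summary_tf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(text, training, k=3):
    """Label a free text by majority vote among its k most similar training vectors."""
    query = dict(Counter(tokenize(text)))
    scored = sorted(((cosine(query, vec), label) for vec, label in training),
                    reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Invented toy corpus, for illustration only.
TRAIN = [
    ("The team won the football match. Fans celebrated all night.", "sports"),
    ("The striker scored two goals. The coach praised the defense.", "sports"),
    ("The bank raised interest rates. Markets reacted quickly.", "finance"),
    ("Stock prices fell sharply. Investors sold their shares.", "finance"),
]

# Only the training documents are summarized; each vector keeps summary terms only.
training = [(summary_weighted_vector(doc), label) for doc, label in TRAIN]

if __name__ == "__main__":
    print(knn_classify("The football team scored goals in the match", training))
    print(knn_classify("Investors watched stock prices and interest rates", training))
```

Because each training vector holds only the terms of its summary, the vectors are shorter than full-text vectors, which is where the reduction in dimensionality and training cost would come from in a sketch like this.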