FE-TAC: an effective document classification method combining feature extraction and feature selection
Authors: Kshetrimayum Nareshkumar Singh, Haobam Mamata Devi, Anjana Kakoti Mahant, Ahongsangbam Dorendro
Journal: International Journal of Applied Decision Sciences
Publication date: 2023-01-01
DOI: 10.1504/ijads.2023.134204 (https://doi.org/10.1504/ijads.2023.134204)
An effective classification method requires the most informative and relevant set of features. In this paper, we discuss an enhanced text classification method that combines feature extraction (FE) and feature selection. First, we use the FE method to extract features from text data, and then we apply the feature selection method to select the most relevant of the extracted features. During feature selection, we introduce a new measure called term affinity to the class (TAC), which estimates how strongly a term is retained as a member of a particular class. TAC is computed by combining the normalised document frequency of the term with the sum of its occurrence frequencies in the specific class. Experimental results on three existing datasets (BBC, Classic4 and 20 Newsgroups) and on our own dataset, called 'Sangai', show that the proposed method outperforms other competing methods in terms of accuracy.
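The abstract does not give the exact TAC formula, so the following is only a minimal sketch of a class-affinity score in the same spirit, not the authors' definition. It assumes, purely for illustration, that the score is the product of two quantities: the fraction of a class's documents that contain the term (a normalised document frequency), and the share of the term's total occurrences that fall in that class. The function names `tac_scores` and `select_top_k` are hypothetical, and documents are assumed to be pre-tokenised lists of strings.

```python
from collections import Counter, defaultdict


def tac_scores(docs, labels):
    """Return {class: {term: score}} for an ASSUMED TAC-like measure.

    docs   -- list of token lists (one list per document)
    labels -- class label of each document, same length as docs
    The product form used below is an illustrative assumption,
    not the formula from the paper.
    """
    docs_per_class = Counter(labels)
    doc_freq = defaultdict(Counter)   # class -> term -> docs in class containing term
    term_freq = defaultdict(Counter)  # class -> term -> occurrences of term in class

    for tokens, label in zip(docs, labels):
        counts = Counter(tokens)
        term_freq[label].update(counts)        # add raw occurrence counts
        doc_freq[label].update(counts.keys())  # count each term once per document

    # Total occurrences of every term across all classes.
    total_freq = Counter()
    for label in term_freq:
        total_freq.update(term_freq[label])

    scores = {}
    for label, n_docs in docs_per_class.items():
        scores[label] = {}
        for term, tf in term_freq[label].items():
            norm_df = doc_freq[label][term] / n_docs  # normalised document frequency
            class_share = tf / total_freq[term]       # share of occurrences in this class
            scores[label][term] = norm_df * class_share
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms per class (ties broken alphabetically)."""
    return {
        label: [t for t, _ in sorted(term_scores.items(),
                                     key=lambda kv: (-kv[1], kv[0]))[:k]]
        for label, term_scores in scores.items()
    }
```

Under these assumptions, a term that appears in every document of one class and nowhere else scores 1.0 for that class, while a term spread evenly across classes scores lower everywhere. The selected terms would then be passed to whichever classifier is used downstream.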
About the journal:
IJADS is a double-blind refereed international journal whose focus is to promote the infusion of the functional and behavioural areas of business with the concepts and methodologies of the decision sciences and information systems. IJADS distinguishes itself as a business journal with an explicit focus on modelling and applied decision-making. The thrust of IJADS is to provide practical guidance to decision makers and practicing managers by publishing papers that bridge the gap between theory and practice of decision sciences and information systems in business, industry, government and academia. Papers published in the journal must contain some link to practice through realistically detailed examples or real applications.