Analysis of web data classification methods based on semantic similarity measure
K. Ramesh, Mohanasundaram R
Information Security Journal: A Global Perspective. Published 2022-06-03.
DOI: https://doi.org/10.1080/19393555.2022.2080614
Citation count: 0
Abstract
This survey reviews 60 research papers on web data classification techniques, which are used to classify web data effectively and to measure the semantic relatedness between two words. The techniques are grouped into three types: semantic-based, search engine-based, and WordNet-based approaches. The survey also reports the research issues and challenges confronted by the existing techniques. The reviewed works are then analyzed by classification technique, dataset, and evaluation metrics. The analysis shows that the semantic-based approach is the most widely used technique for classifying web data. Similarly, the Miller-Charles dataset is the most commonly used dataset across the reviewed papers, and evaluation metrics such as precision, recall, and F-measure are widely used in web data classification. The insights from this survey can help readers understand research gaps and open problems in this area. These could be addressed in future work by developing novel optimization algorithms, which might enhance the performance of web data classification.
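As a concrete illustration of the evaluation metrics named above, the following minimal sketch computes precision, recall, and F-measure for one class of a classifier's output. The function name `precision_recall_f1`, the labels, and the sample predictions are all hypothetical and are not taken from the surveyed papers; the F-measure here is assumed to be the balanced F1 score.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Return (precision, recall, F1) for the class `positive`.

    Hypothetical helper for illustration only; F1 is the harmonic
    mean of precision and recall.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted as another class.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels for five web pages (illustrative data only).
true_labels = ["sports", "news", "sports", "news", "sports"]
predicted_labels = ["sports", "sports", "sports", "news", "news"]

precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels, "sports")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With these made-up labels, two of the three pages predicted as "sports" are correct and two of the three actual "sports" pages are found, so all three metrics come out to 2/3.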