Comparative Analysis for KeyTerms Extraction Methods for Personalized Search Engines
Shaurya Uppal, Arti Jain, Anuja Arora
2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2020
DOI: 10.1109/Confluence47617.2020.9057810
Citations: 3
Abstract
Text mining is the extraction of nontrivial, hidden, and interesting knowledge from unstructured textual data. This paper focuses on interpreting text mining queries in the healthcare domain. The dataset is taken from 1mg, a company that emerged in 2015 to provide transparent, authentic, and accessible healthcare information to millions of people while guiding customers to quality care at affordable prices. Different text mining algorithms are compared for extracting keyterms and linking them to personalized search concepts in the healthcare domain, with the aim of better search recommendations. The algorithms are basic TF-IDF, SGRank with IDF, TextRank, and modified TF-IDF. The best results are obtained with the modified TF-IDF combined with the Shingle analyzer, where post-release overall is reduced.
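To make the baseline concrete, the following is a minimal sketch of plain TF-IDF keyterm ranking over word shingles (n-grams of one or two words). It is not the authors' modified TF-IDF, whose details the abstract does not give. The function names `shingles` and `tfidf_keyterms`, the whitespace tokenizer, the unsmoothed IDF formula, and the sample queries are all assumptions made for illustration.

```python
import math
from collections import Counter


def shingles(text, max_n=2):
    """Return word n-grams (shingles) of length 1..max_n from lowercased text."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_keyterms(docs, top_k=3, max_n=2):
    """Rank each document's shingles by TF-IDF; return the top_k per document."""
    doc_terms = [shingles(d, max_n) for d in docs]
    # Document frequency: number of documents that contain each shingle.
    df = Counter(term for terms in doc_terms for term in set(terms))
    n_docs = len(docs)
    ranked = []
    for terms in doc_terms:
        tf = Counter(terms)
        # Assumption: relative term frequency times unsmoothed log IDF.
        scores = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        ranked.append(top)
    return ranked


if __name__ == "__main__":
    # Hypothetical healthcare search queries, not taken from the 1mg dataset.
    queries = [
        "paracetamol tablet for fever",
        "vitamin c tablet",
        "cough syrup for fever",
    ]
    for query, terms in zip(queries, tfidf_keyterms(queries)):
        print(query, "->", [term for term, _ in terms])
```

With the unsmoothed IDF used here, a shingle that occurs in every document scores zero, so terms that do not discriminate between queries fall to the bottom of the ranking.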