Intelligent Web topics search using early detection and data analysis
Ching-Cheng Lee, Yixin Yang
Proceedings 27th Annual International Computer Software and Applications Conference (COMPSAC 2003), 2003-11-03
DOI: 10.1109/CMPSAC.2003.1245399
Topic-specific search engines, which return relevant topics as search results, have recently been developed. However, these engines require intensive human effort to build and maintain, and they visit many irrelevant pages. In our project, we propose a new approach to Web topic search. First, we perform early detection of "candidate topics" while extracting words from the HTML text. Second, we perform data analysis on the appearance information of candidate topics, such as how many times and where they appear. With these two techniques, we can reduce the crawling time and computing cost for candidate topics. We analyze the results and compare them with related research to demonstrate the effectiveness of our approach.
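The abstract does not give implementation details, so the following is only a minimal sketch of how the two steps might fit together, not the authors' method. It assumes that "early detection" means flagging a word as a candidate topic as soon as it is seen in a prominent place (title, heading, or link text) during word extraction, and that "data analysis" means scoring each candidate by how often and where it appears. The names `CandidateTopicParser`, `rank_topics`, `PLACE_WEIGHTS`, and `STOPWORDS`, as well as the weight values, are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical weights for where a word appears; not taken from the paper.
PLACE_WEIGHTS = {"title": 3.0, "h1": 2.0, "a": 1.5, "body": 1.0}
# Tiny illustrative stopword list (assumption).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


class CandidateTopicParser(HTMLParser):
    """Extracts words and records appearance counts per place in one pass."""

    def __init__(self):
        super().__init__()
        self._stack = []  # currently open tags
        # word -> place -> number of appearances
        self.appearances = defaultdict(lambda: defaultdict(int))
        self.candidates = set()  # words flagged during extraction

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self._stack:
            # Pop up to and including the matching open tag.
            while self._stack and self._stack.pop() != tag:
                pass

    def _place(self):
        # Innermost enclosing tag that has a weight; default to "body".
        for tag in reversed(self._stack):
            if tag in PLACE_WEIGHTS:
                return tag
        return "body"

    def handle_data(self, data):
        if "script" in self._stack or "style" in self._stack:
            return  # skip non-visible text
        place = self._place()
        for word in re.findall(r"[a-z]+", data.lower()):
            if word in STOPWORDS:
                continue
            self.appearances[word][place] += 1
            # Assumed early-detection rule: a word seen in a prominent
            # place becomes a candidate immediately, during extraction.
            if place != "body":
                self.candidates.add(word)


def rank_topics(html):
    """Score candidates by weighted appearance counts, highest first."""
    parser = CandidateTopicParser()
    parser.feed(html)
    parser.close()
    scores = {
        word: sum(PLACE_WEIGHTS[p] * n
                  for p, n in parser.appearances[word].items())
        for word in parser.candidates
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch only words flagged during extraction are scored, so words that occur solely in body text are never ranked; that is one way the analysis cost could stay proportional to the number of candidates rather than to the full vocabulary of the page.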