{"title":"一种基于域的Web日志数据搜索引擎架构","authors":"P. Sharma, Divakar Yadav","doi":"10.34028/iajit/20/1/10","DOIUrl":null,"url":null,"abstract":"Search engines, an information retrieval tool are the main source of information for users’ information need now a day. For every query, the search engine explores its repository and/or indexer to find the relevant documents/URLs for that query. Page ranking algorithms rank the Uniform Resource Locator in abstract section (URLs) according to its relevancy with respect to users’ query. It is analyzed that many of the queries fired by users on search engines are duplicate. There is a scope to improve the performance of search engine to reduce its efforts for duplicate queries. In this paper a proxy server is created that keep store the search results of user queries in web log. The proposed proxy server uses this web log to find results faster for duplicate queries fired next time. The proposed scheme has been tested and found prominent. The proposed architecture tested for ten duplicate user queries. it return all relevant web pages for duplicate user query (if query is found in web log at proxy server) from a particular domain instead of entire database. It reduces the perceived latency for duplicate query and also improves the value of precession and accuracy up to 81.8% and 99% respectively for all duplicate user queries.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. Technol.","volume":"8 1","pages":"92-101"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Novel Architecture for Search Engine using Domain Based Web Log Data\",\"authors\":\"P. Sharma, Divakar Yadav\",\"doi\":\"10.34028/iajit/20/1/10\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Search engines, an information retrieval tool are the main source of information for users’ information need now a day. For every query, the search engine explores its repository and/or indexer to find the relevant documents/URLs for that query. Page ranking algorithms rank the Uniform Resource Locator in abstract section (URLs) according to its relevancy with respect to users’ query. It is analyzed that many of the queries fired by users on search engines are duplicate. There is a scope to improve the performance of search engine to reduce its efforts for duplicate queries. In this paper a proxy server is created that keep store the search results of user queries in web log. The proposed proxy server uses this web log to find results faster for duplicate queries fired next time. The proposed scheme has been tested and found prominent. The proposed architecture tested for ten duplicate user queries. it return all relevant web pages for duplicate user query (if query is found in web log at proxy server) from a particular domain instead of entire database. It reduces the perceived latency for duplicate query and also improves the value of precession and accuracy up to 81.8% and 99% respectively for all duplicate user queries.\",\"PeriodicalId\":13624,\"journal\":{\"name\":\"Int. Arab J. Inf. Technol.\",\"volume\":\"8 1\",\"pages\":\"92-101\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. Arab J. Inf. Technol.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34028/iajit/20/1/10\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. Arab J. Inf. Technol.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34028/iajit/20/1/10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Novel Architecture for Search Engine using Domain Based Web Log Data
Search engines, an information retrieval tool are the main source of information for users’ information need now a day. For every query, the search engine explores its repository and/or indexer to find the relevant documents/URLs for that query. Page ranking algorithms rank the Uniform Resource Locator in abstract section (URLs) according to its relevancy with respect to users’ query. It is analyzed that many of the queries fired by users on search engines are duplicate. There is a scope to improve the performance of search engine to reduce its efforts for duplicate queries. In this paper a proxy server is created that keep store the search results of user queries in web log. The proposed proxy server uses this web log to find results faster for duplicate queries fired next time. The proposed scheme has been tested and found prominent. The proposed architecture tested for ten duplicate user queries. it return all relevant web pages for duplicate user query (if query is found in web log at proxy server) from a particular domain instead of entire database. It reduces the perceived latency for duplicate query and also improves the value of precession and accuracy up to 81.8% and 99% respectively for all duplicate user queries.