Adding the temporal dimension to search - a case study in publication search
Philip S. Yu, Xin Li, B. Liu
The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)
Published: 2005-09-19
DOI: 10.1109/WI.2005.21 (https://doi.org/10.1109/WI.2005.21)
Citation count: 51
Abstract
PageRank and HITS are perhaps the best-known search techniques. In this paper, we argue that these algorithms miss an important dimension, the temporal dimension. Pages that were of high quality in the past may not be of high quality now or in the future. These techniques favor older pages because such pages have accumulated many in-links over time. New pages, which may be of high quality, have few or no in-links and are left behind. Research publication search has the same problem: under PageRank or HITS, older or classic papers rank high because of the large number of citations they received in the past. This paper studies the temporal dimension of search in the context of research publications. We propose a number of methods that address the problem by analyzing the behavior history and the source of each publication. We evaluate these methods empirically, and our results show that they are highly effective.
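The bias described in the abstract can be illustrated with a small sketch. The abstract does not specify the proposed methods, so the code below is not the authors' technique: it is a generic exponentially time-decayed citation count, and the function names, the `decay` parameter, and the toy citation data are all assumptions made for illustration. It only shows how counting every citation equally favors an older paper, while discounting old citations can let a recently cited paper rank first.

```python
from collections import defaultdict


def raw_citation_counts(citations):
    """Count every citation equally, regardless of when it was made."""
    counts = defaultdict(float)
    for _citing, cited, _year in citations:
        counts[cited] += 1.0
    return dict(counts)


def decayed_citation_scores(citations, now, decay=0.5):
    """Weight each citation by decay ** age_in_years.

    Hypothetical scheme for illustration only; `decay` is an assumed
    parameter, not one taken from the paper.
    """
    scores = defaultdict(float)
    for _citing, cited, year in citations:
        age = max(0, now - year)  # citations dated in the future count as age 0
        scores[cited] += decay ** age
    return dict(scores)


def rank(scores):
    """Return paper ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented): (citing paper, cited paper, year of the citation).
citations = [
    ("p1", "classic", 1995),
    ("p2", "classic", 1996),
    ("p3", "classic", 1997),
    ("p4", "classic", 1998),
    ("p5", "classic", 1999),
    ("p6", "recent", 2004),
    ("p7", "recent", 2004),
    ("p8", "recent", 2005),
]

raw_ranking = rank(raw_citation_counts(citations))
decayed_ranking = rank(decayed_citation_scores(citations, now=2005))

print("raw count ranking:    ", raw_ranking)
print("time-decayed ranking: ", decayed_ranking)
```

With this toy data the raw count puts the older paper first (5 citations against 3), while the decayed score puts the recently cited paper first. How aggressively to discount old citations is a design choice; a `decay` close to 1 approaches the raw count.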