{"title":"Compaction techniques for nextword indexes","authors":"D. Bahle, H. Williams, J. Zobel","doi":"10.1109/SPIRE.2001.989735","DOIUrl":null,"url":null,"abstract":"Most queries to text search engines are ranked or Boolean. Phrase querying is a powerful technique for refining searches, but is expensive to implement on conventional indexes. In previous work we introduced the nextword index, a structure specifically designed for phrase queries, which however is relatively large. In this paper we introduce new compaction techniques for nextword indexes. In contrast to most index compression schemes, these techniques are lossy, yet as we show allow full resolution of phrase queries without false match checking. We show experimentally that our novel techniques lead to significant savings in index size.","PeriodicalId":107511,"journal":{"name":"Proceedings Eighth Symposium on String Processing and Information Retrieval","volume":"54 5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Eighth Symposium on String Processing and Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPIRE.2001.989735","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 19
Abstract
Most queries to text search engines are ranked or Boolean. Phrase querying is a powerful technique for refining searches, but is expensive to implement on conventional indexes. In previous work we introduced the nextword index, a structure specifically designed for phrase queries, which, however, is relatively large. In this paper we introduce new compaction techniques for nextword indexes. In contrast to most index compression schemes, these techniques are lossy, yet, as we show, they allow full resolution of phrase queries without false match checking. We show experimentally that our novel techniques lead to significant savings in index size.
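To make the idea of a nextword index concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common description of the structure: each first word maps to the words that follow it, and each (first word, next word) pair carries a list of the positions where that pair occurs. The function names `build_nextword_index` and `phrase_query`, the whitespace tokenization, and the in-memory dictionaries are all illustrative assumptions. The compaction techniques the abstract refers to are not shown, since the abstract does not describe them.

```python
from collections import defaultdict


def build_nextword_index(docs):
    """Build a toy nextword index (hypothetical helper, not from the paper).

    index[firstword][nextword] is a list of (doc_id, position) postings,
    where position is the offset of firstword in the document.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        words = text.lower().split()  # assumption: naive whitespace tokenizer
        for pos in range(len(words) - 1):
            index[words[pos]][words[pos + 1]].append((doc_id, pos))
    return index


def phrase_query(index, phrase):
    """Return sorted (doc_id, start_position) matches for a phrase of 2+ words.

    The phrase is resolved by intersecting the postings of its consecutive
    word pairs, aligned by their offset from the start of the phrase.
    """
    words = phrase.lower().split()
    if len(words) < 2:
        raise ValueError("phrase_query needs at least two words")

    # Start from every occurrence of the first word pair.
    candidates = set(index.get(words[0], {}).get(words[1], []))

    # Keep only candidates where each later pair occurs at the right offset.
    for offset in range(1, len(words) - 1):
        postings = set(index.get(words[offset], {}).get(words[offset + 1], []))
        candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in postings}

    return sorted(candidates)


docs = [
    "the quick brown fox",
    "a quick brown dog and the quick brown fox",
]
index = build_nextword_index(docs)
matches = phrase_query(index, "quick brown fox")
```

Because every distinct word pair gets its own postings list, a structure like this grows considerably larger than a conventional inverted index, which is consistent with the abstract's remark that the nextword index is relatively large.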