{"title":"Using Term Location Information to Enhance Probabilistic Information Retrieval","authors":"Baiyan Liu, X. An, Xiangji Huang","doi":"10.1145/2766462.2767827","journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval"},"publicationDate":"2015-08-09","citationCount":"25","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/2766462.2767827"}
Using Term Location Information to Enhance Probabilistic Information Retrieval
Nouns are more important than other parts of speech in information retrieval and are more often found near the beginning or the end of sentences. In this paper, we investigate how rewarding terms based on their location in sentences affects information retrieval. In particular, we propose a novel Term Location (TEL) retrieval model based on BM25 to enhance probabilistic information retrieval, where a kernel-based method is used to capture term placement patterns. Experiments on five TREC datasets of varied size and content indicate that the proposed model significantly outperforms the optimized BM25 and DirichletLM in MAP on all datasets with all kernel functions, and outperforms the optimized BM25 and DirichletLM on most of the datasets in P@5 and P@20 with different kernel functions.
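To make the idea of a kernel-based location reward on top of BM25 concrete, the following is a minimal sketch and not the TEL formula from the paper, which the abstract does not give. It assumes a Gaussian kernel over a term's normalized distance to the nearest sentence boundary and a hypothetical mixing parameter `alpha` that adds the accumulated kernel weight to the raw term frequency before BM25 saturation. The names `location_weight`, `tel_style_score`, `sigma`, and `alpha` are all illustrative.

```python
import math
from collections import defaultdict


def gaussian_kernel(distance, sigma=0.25):
    """Gaussian kernel; sigma is an assumed bandwidth, not a value from the paper."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def location_weight(index, sentence_length, kernel=gaussian_kernel):
    """Weight in (0, 1] that peaks at the first and last token of a sentence."""
    if sentence_length <= 1:
        return 1.0
    rel = index / (sentence_length - 1)   # 0.0 = first token, 1.0 = last token
    edge_distance = min(rel, 1.0 - rel)   # 0.0 at either end, 0.5 in the middle
    return kernel(edge_distance)


def bm25_idf(doc_freq, n_docs):
    """A common non-negative BM25 IDF variant."""
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)


def tel_style_score(query_terms, doc_sentences, doc_freqs, n_docs, avg_doc_len,
                    k1=1.2, b=0.75, alpha=0.5):
    """BM25 score with a location-boosted term frequency (assumed combination).

    doc_sentences is a list of tokenized sentences; alpha=0 gives plain BM25.
    """
    doc_len = sum(len(sentence) for sentence in doc_sentences)
    tf = defaultdict(int)
    loc = defaultdict(float)
    for sentence in doc_sentences:
        for i, term in enumerate(sentence):
            tf[term] += 1
            loc[term] += location_weight(i, len(sentence))

    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    score = 0.0
    for term in query_terms:
        if tf.get(term, 0) == 0:
            continue
        # Hypothetical reward: occurrences near sentence edges count for more.
        boosted_tf = tf[term] + alpha * loc[term]
        idf = bm25_idf(doc_freqs.get(term, 0), n_docs)
        score += idf * boosted_tf * (k1 + 1) / (boosted_tf + length_norm)
    return score


if __name__ == "__main__":
    dfs = {"retrieval": 2}
    edge = [["retrieval", "models", "rank", "documents", "well"]]
    middle = [["models", "rank", "retrieval", "documents", "well"]]
    print(tel_style_score(["retrieval"], edge, dfs, n_docs=10, avg_doc_len=5.0))
    print(tel_style_score(["retrieval"], middle, dfs, n_docs=10, avg_doc_len=5.0))
```

Under these assumptions, a document whose query term sits at a sentence boundary scores higher than an otherwise identical document with the term mid-sentence. Other kernels (for example triangle or cosine) could be passed through the `kernel` argument to compare placement patterns.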