
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval: latest papers

Estimating query representativeness for query-performance prediction
Mor Sondak, Anna Shtok, Oren Kurland
The query-performance prediction (QPP) task is estimating retrieval effectiveness with no relevance judgments. We present a novel probabilistic framework for QPP that gives rise to an important aspect that was not addressed in previous work; namely, the extent to which the query effectively represents the information need for retrieval. Accordingly, we devise a few query-representativeness measures that utilize relevance language models. Experiments show that integrating the most effective measures with state-of-the-art predictors in our framework often yields prediction quality that significantly transcends that of using the predictors alone.
DOI: https://doi.org/10.1145/2484028.2484107 (published 2013-07-28)
Citations: 7
How do users respond to voice input errors?: lexical and phonetic query reformulation in voice search
Jiepu Jiang, Wei Jeng, Daqing He
Voice search offers users a new search experience: instead of typing, users can vocalize their search queries. However, due to voice input errors (such as speech recognition errors and improper system interruptions), users frequently need to reformulate queries to handle the incorrectly recognized ones. We conducted user experiments with native English speakers on their query reformulation behaviors in voice search and found that users often reformulate queries with both lexical and phonetic changes to previous queries. In this paper, we first characterize and analyze typical voice input errors in voice search and users' corresponding reformulation strategies. Then, we evaluate the impact of typical voice input errors on users' search progress and the effectiveness of different reformulation strategies in handling these errors. This study provides a clearer picture of how to further improve current voice search systems.
DOI: https://doi.org/10.1145/2484028.2484092 (published 2013-07-28)
Citations: 105
A general evaluation measure for document organization tasks
Enrique Amigó, Julio Gonzalo, M. Verdejo
A number of key Information Access tasks (Document Retrieval, Clustering, Filtering, and their combinations) can be seen as instances of a generic document organization problem that establishes priority and relatedness relationships between documents (in other words, a problem of forming and ranking clusters). As far as we know, no analysis has yet been made of the evaluation of these tasks from a global perspective. In this paper we propose two complementary evaluation measures, Reliability and Sensitivity, for the generic Document Organization task, which are derived from a proposed set of formal constraints (properties that any suitable measure must satisfy). In addition to being the first measures that can be applied to any mixture of ranking, clustering and filtering tasks, Reliability and Sensitivity satisfy more formal constraints than previously existing evaluation metrics for each of the subsumed tasks. Besides their formal properties, their most salient feature from an empirical point of view is their strictness: a high score according to the harmonic mean of Reliability and Sensitivity ensures a high score with any of the most popular evaluation metrics in all the Document Retrieval, Clustering and Filtering datasets used in our experiments.
DOI: https://doi.org/10.1145/2484028.2484081 (published 2013-07-28)
Citations: 117
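The abstract of "A general evaluation measure for document organization tasks" combines Reliability and Sensitivity through their harmonic mean. The sketch below only illustrates that combination step: it assumes both scores are already computed and lie in [0, 1], and the function name `harmonic_mean` is a hypothetical helper, not something taken from the paper. How Reliability and Sensitivity themselves are computed is not stated in the abstract and is not reproduced here.

```python
def harmonic_mean(reliability: float, sensitivity: float) -> float:
    """Harmonic mean of two non-negative scores (hypothetical helper).

    Assumes both inputs are precomputed scores in [0, 1]; the definition
    of the scores themselves is outside the scope of this sketch.
    """
    if reliability < 0 or sensitivity < 0:
        raise ValueError("scores must be non-negative")
    total = reliability + sensitivity
    if total == 0:
        # Both scores are zero; define the mean as zero to avoid 0/0.
        return 0.0
    return 2 * reliability * sensitivity / total


if __name__ == "__main__":
    # One weak score drags the combined value down, unlike an arithmetic mean.
    print(harmonic_mean(0.9, 0.6))
    print(harmonic_mean(1.0, 0.0))
```

Because the harmonic mean is never larger than twice the smaller of its two inputs, a high combined score is only possible when both inputs are high, which fits the strictness the abstract describes.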
Query representation for cross-temporal information retrieval
Miles Efron
This paper addresses the problem of long-term language change in information retrieval (IR) systems. IR research has often ignored lexical drift. But in the emerging domain of massive digitized book collections, the risk of vocabulary mismatch due to language change is high. Collections such as Google Books and the Hathi Trust contain text written in the vernaculars of many centuries. With respect to IR, changes in vocabulary and orthography make 14th-Century English qualitatively different from 21st-Century English. This challenges retrieval models that rely on keyword matching. With this challenge in mind, we ask: given a query written in contemporary English, how can we retrieve relevant documents that were written in early English? We argue that search in historically diverse corpora is similar to cross-language retrieval (CLIR). By considering "modern" English and "archaic" English as distinct languages, CLIR techniques can improve what we call cross-temporal IR (CTIR). We focus on ways to combine evidence to improve CTIR effectiveness, proposing and testing several ways to handle language change during book search. We find that a principled combination of three sources of evidence during relevance feedback yields strong CTIR performance.
DOI: https://doi.org/10.1145/2484028.2484054 (published 2013-07-28)
Citations: 9
Beyond relevance: on novelty and diversity in tag recommendation
F. Belém
We propose to explicitly exploit issues related to novelty and diversity in tag recommendation tasks, an unexplored research avenue (only relevance issues have been investigated so far), in order to improve user experience and satisfaction. We propose new tag recommendation strategies to cover these issues and highlight the involved challenges.
DOI: https://doi.org/10.1145/2484028.2484229 (published 2013-07-28)
Citations: 0
Summary of the NTCIR-10 INTENT-2 task: subtopic mining and search result diversification
T. Sakai, Zhicheng Dou, Takehiro Yamamoto, Yiqun Liu, Min Zhang, Makoto P. Kato, Ruihua Song, Mayu Iwata
The NTCIR INTENT task comprises two subtasks: Subtopic Mining, where systems are required to return a ranked list of subtopic strings for each given query; and Document Ranking, where systems are required to return a diversified web search result for each given query. This paper summarises the novel features of the Second INTENT task at NTCIR-10 and its main findings, and poses some questions for future diversified search evaluation.
DOI: https://doi.org/10.1145/2484028.2484104 (published 2013-07-28)
Citations: 24
Retrieving documents with mathematical content
Shahab Kamali, Frank Wm. Tompa
Many documents with mathematical content are published on the Web, but conventional search engines that rely only on keyword search cannot fully exploit their mathematical information. In particular, keyword search is insufficient when expressions in a document are not annotated with natural keywords or the user cannot describe her query with keywords. Retrieving documents by querying their mathematical content directly is very appealing in various domains such as education, digital libraries, engineering, patent documents, medical sciences, etc. Capturing the relevance of mathematical expressions also greatly enhances document classification in such domains. Unlike text retrieval, where keywords carry enough semantics to distinguish text documents and rank them, math symbols do not contain much semantic information on their own. In fact, mathematical expressions typically consist of few alphabetical symbols organized in rather complex structures. Hence, the structure of an expression, which describes the way such symbols are combined, should also be considered. Unfortunately, there is no standard testbed with which to evaluate the effectiveness of a mathematics retrieval algorithm. In this paper we study the fundamental and challenging problems in mathematics retrieval, that is, how to capture the relevance of mathematical expressions, how to query them, and how to evaluate the results. We describe various search paradigms and propose retrieval systems accordingly. We discuss the benefits and drawbacks of each approach, and further compare them through an extensive empirical study.
DOI: https://doi.org/10.1145/2484028.2484083 (published 2013-07-28)
Citations: 37
Searching in the city of knowledge: challenges and recent developments
V. Bicer, V. López
Today plenty of data is emerging from various city systems. Beyond the classical Web resources, large amounts of data are retrieved from sensors, devices, social networks, governmental applications, or service networks. In such a diversity of information, answering specific information needs of city inhabitants requires holistic IR techniques, capable of harnessing different types of city data and turning them into actionable insights to answer different queries. This tutorial will present deep insights, challenges, opportunities and techniques to make heterogeneous city data searchable and show how emerging IR techniques and models can be employed to retrieve relevant information for the citizens.
DOI: https://doi.org/10.1145/2484028.2484195 (published 2013-07-28)
Citations: 5
Mapping queries to questions: towards understanding users' information needs
Yunjun Gao, Lu Chen, Rui Li, Gang Chen
In this paper, for the first time, we study the problem of mapping keyword queries to questions on community-based question answering (CQA) sites. Mapping general web queries to questions enables search engines not only to discover explicit and specific information needs (questions) behind keyword queries, but also to find high quality information (answers) for answering keyword queries. In order to map queries to questions, we propose a ranking algorithm containing three steps: Candidate Question Selection, Candidate Question Ranking, and Candidate Question Grouping. Preliminary experimental results using 60 queries from search logs of a commercial engine show that the presented approach can efficiently find the questions which capture users' information needs explicitly.
DOI: https://doi.org/10.1145/2484028.2484138 (published 2013-07-28)
Citations: 2
A financial cost metric for result caching
Fethi Burak Sazoglu, B. B. Cambazoglu, R. Ozcan, I. S. Altingövde, Ö. Ulusoy
Web search engines cache results of frequent and/or recent queries. Result caching strategies can be evaluated using different metrics, hit rate being the most well-known. Recent work takes the processing overhead of queries into account when evaluating the performance of result caching strategies and proposes cost-aware caching strategies. In this paper, we propose a financial cost metric that goes one step further and also takes hourly electricity prices into account when computing the cost. We evaluate the most well-known static, dynamic, and hybrid result caching strategies under this new metric. Moreover, we propose a financial-cost-aware version of the well-known LRU strategy and show that it outperforms the original LRU strategy in terms of the financial cost metric.
DOI: https://doi.org/10.1145/2484028.2484182 (published 2013-07-28)
Citations: 20
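To make the idea in the "A financial cost metric for result caching" abstract concrete, here is a minimal sketch of evaluating a plain LRU result cache by hit rate and by a price-weighted cost. The abstract does not give the metric's formula, so the cost model used here is an assumption: each cache miss is charged its processing cost multiplied by the electricity price of the hour in which it occurs. The function name `evaluate_lru`, the tuple layout of the query stream, and the `hourly_price` mapping are all hypothetical, and this is the ordinary LRU policy, not the paper's financial-cost-aware variant.

```python
from collections import OrderedDict


def evaluate_lru(stream, capacity, hourly_price):
    """Simulate a plain LRU result cache and report two metrics.

    stream: iterable of (query, hour, processing_cost) tuples (assumed layout).
    hourly_price: mapping from hour to electricity price (assumed input).
    Returns (hit_rate, financial_cost), where financial_cost sums
    processing_cost * hourly_price[hour] over cache misses only
    (an assumed cost model, not the paper's exact definition).
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    total = 0
    financial_cost = 0.0
    for query, hour, processing_cost in stream:
        total += 1
        if query in cache:
            hits += 1
            cache.move_to_end(query)  # mark as most recently used
        else:
            # A miss means the query is processed at the current price.
            financial_cost += processing_cost * hourly_price[hour]
            cache[query] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    hit_rate = hits / total if total else 0.0
    return hit_rate, financial_cost


if __name__ == "__main__":
    prices = {0: 0.10, 1: 0.20, 2: 0.30}
    queries = [("a", 0, 2.0), ("b", 0, 1.0), ("a", 1, 2.0),
               ("c", 1, 1.0), ("b", 2, 1.0)]
    print(evaluate_lru(queries, capacity=2, hourly_price=prices))
```

Under this assumed model, two strategies with the same hit rate can differ in cost when their misses fall in hours with different prices, which is the kind of distinction a price-aware metric is meant to expose.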