Where Shall We Go Today? Planning Touristic Tours with TripBuilder
Igo Ramalho Brilhante, J. Macêdo, F. M. Nardini, R. Perego, C. Renso
DOI: 10.1145/2505515.2505643

In this paper we propose TripBuilder, a new framework for personalized touristic tour planning. We mine from Flickr information about the actual itineraries followed by a multitude of different tourists, and match these itineraries against the touristic Points of Interest available from Wikipedia. The task of planning personalized touristic tours is then modeled as an instance of the Generalized Maximum Coverage problem. Wisdom-of-the-crowds information allows us to derive touristic plans that maximize a measure of interest for the tourist, given her preferences and visiting time budget. Experimental results on three different touristic cities show that our approach is effective and outperforms strong baselines.
{"title":"Where shall we go today?: planning touristic tours with tripbuilder","authors":"Igo Ramalho Brilhante, J. Macêdo, F. M. Nardini, R. Perego, C. Renso","doi":"10.1145/2505515.2505643","DOIUrl":"https://doi.org/10.1145/2505515.2505643","url":null,"abstract":"In this paper we propose TripBuilder, a new framework for personalized touristic tour planning. We mine from Flickr the information about the actual itineraries followed by a multitude of different tourists, and we match these itineraries on the touristic Point of Interests available from Wikipedia. The task of planning personalized touristic tours is then modeled as an instance of the Generalized Maximum Coverage problem. Wisdom-of-the-crowds information allows us to derive touristic plans that maximize a measure of interest for the tourist given her preferences and visiting time-budget. Experimental results on three different touristic cities show that our approach is effective and outperforms strong baselines.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81666087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatio-Temporal and Events-Based Analysis of Topic Popularity in Twitter
S. Ardon, A. Bagchi, A. Mahanti, Amit Ruhela, Aaditeshwar Seth, R. M. Tripathy, Sipat Triukose
DOI: 10.1145/2505515.2505525

We present the first comprehensive characterization of the diffusion of ideas on Twitter, studying more than 5.96 million topics spanning both popular and less popular ones. On a data set containing approximately 10 million users and a comprehensive scrape of 196 million tweets, we perform a rigorous temporal and spatial analysis, investigating the time-evolving properties of the subgraphs formed by the users discussing each topic. We focus on two different notions of space: the network topology formed by follower-following links on Twitter, and the geospatial location of the users. We investigate the effect of initiators on the popularity of topics and find that users with a high number of followers have a strong impact on topic popularity. We deduce that topics become popular when disjoint clusters of users discussing them begin to merge and form one giant component that grows to cover a significant fraction of the network. Our geospatial analysis shows that highly popular topics are those that cross regional boundaries aggressively.
{"title":"Spatio-temporal and events based analysis of topic popularity in twitter","authors":"S. Ardon, A. Bagchi, A. Mahanti, Amit Ruhela, Aaditeshwar Seth, R. M. Tripathy, Sipat Triukose","doi":"10.1145/2505515.2505525","DOIUrl":"https://doi.org/10.1145/2505515.2505525","url":null,"abstract":"We present the first comprehensive characterization of the diffusion of ideas on Twitter, studying more than 5.96 million topics that include both popular and less popular topics. On a data set containing approximately 10 million users and a comprehensive scraping of 196 million tweets, we perform a rigorous temporal and spatial analysis, investigating the time-evolving properties of the subgraphs formed by the users discussing each topic. We focus on two different notions of the spatial: the network topology formed by follower-following links on Twitter, and the geospatial location of the users. We investigate the effect of initiators on the popularity of topics and find that users with a high number of followers have a strong impact on topic popularity. We deduce that topics become popular when disjoint clusters of users discussing them begin to merge and form one giant component that grows to cover a significant fraction of the network. Our geospatial analysis shows that highly popular topics are those that cross regional boundaries aggressively.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"115 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81791143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SPHINX: Rich Insights into Evidence-Hypotheses Relationships via Parameter Space-Based Exploration
Abhishek Mukherji, Jason Whitehouse, Christopher R. Botaish, Elke A. Rundensteiner, M. Ward
DOI: 10.1145/2505515.2508202

We demonstrate our SPHINX system, which not only derives but also visualizes evidence-hypothesis relationships on a parameter space of belief and plausibility. SPHINX enables the analyst to interactively explore the contribution of different pieces of evidence to the hypotheses. The key technical contributions of SPHINX span both computational and visual dimensions. The computational contributions cover (a) flexible computational model selection and (b) real-time incremental strength computations. The visual contributions include (a) sense-making over the parameter space; (b) filtering and abstraction options; and (c) novel visual displays such as evidence glyph and skyline views. Using two real datasets, we will demonstrate that the SPHINX system provides analysts with rich insights into evidence-hypothesis relationships, facilitating the discovery and decision-making process.
{"title":"SPHINX: rich insights into evidence-hypotheses relationships via parameter space-based exploration","authors":"Abhishek Mukherji, Jason Whitehouse, Christopher R. Botaish, Elke A. Rundensteiner, M. Ward","doi":"10.1145/2505515.2508202","DOIUrl":"https://doi.org/10.1145/2505515.2508202","url":null,"abstract":"We demonstrate our SPHINX system that not only derives but also visualizes evidence-hypotheses relationships on a parameter space of belief and plausibility. SPHINX facilitates the analyst to interactively explore the contribution of different pieces of evidence towards the hypotheses. The key technical contributions of SPHINX include both computational and visual dimensions. The computational contributions cover (a.) flexible computational model selection; and (b.) real-time incremental strength computations. The visual contributions include (a.) sense-making over parameter space; (b.) filtering and abstraction options; (c.) novel visual displays such as evidence glyph and skyline views. Using two real datasets, we will demonstrate that the SPHINX system provides the analysts with rich insights into evidence-hypothesis relationships facilitating the discovery and decision making process.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79487752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Water Filling Model and the Cube Test: Multi-Dimensional Evaluation for Professional Search
Jiyun Luo, Christopher Wing, G. Yang, Marti A. Hearst
DOI: 10.1145/2505515.2523648

Professional search activities, such as patent and legal search, are often time-sensitive and involve rich information needs with multiple aspects or subtopics. This paper proposes a 3D water filling model to describe this search process, and derives from it a new evaluation metric, the Cube Test, to encompass the complex nature of professional search. The new metric is compared against state-of-the-art patent search evaluation metrics, as well as Web search evaluation metrics, over two distinct patent datasets. The experimental results show that the Cube Test metric effectively captures the characteristics and requirements of professional search.
{"title":"The water filling model and the cube test: multi-dimensional evaluation for professional search","authors":"Jiyun Luo, Christopher Wing, G. Yang, Marti A. Hearst","doi":"10.1145/2505515.2523648","DOIUrl":"https://doi.org/10.1145/2505515.2523648","url":null,"abstract":"Professional search activities such as patent and legal search are often time sensitive and consist of rich information needs with multiple aspects or subtopics. This paper proposes a 3D water filling model to describe this search process, and derives a new evaluation metric, the Cube Test, to encompass the complex nature of professional search. The new metric is compared against state-of-the-art patent search evaluation metrics as well as Web search evaluation metrics over two distinct patent datasets. The experimental results show that the Cube Test metric effectively captures the characteristics and requirements of professional search.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85075475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Aggregated Search Using Interleaving
A. Chuklin, Anne Schuth, Katja Hofmann, P. Serdyukov, M. de Rijke
DOI: 10.1145/2505515.2505698

A result page of a modern web search engine is often much more complicated than a simple list of "ten blue links." In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a system is called an aggregated or federated search system. Because search engines evolve over time, their results need to be constantly evaluated. However, one of the most efficient and widely used evaluation methods, interleaving, cannot be directly applied to aggregated search systems, as it ignores the need to group results originating from the same source (vertical results). We propose an interleaving algorithm that allows comparisons of search engine result pages containing grouped vertical documents. We compare our algorithm to existing interleaving algorithms and other evaluation methods (such as A/B-testing), both on real-life click log data and in simulation experiments. We find that our algorithm allows us to perform unbiased and accurate interleaved comparisons that are comparable to conventional evaluation techniques. We also show that our interleaving algorithm produces a ranking that does not substantially alter the user experience, while being sensitive to changes in both the vertical result block and the non-vertical document rankings. All this makes our proposed interleaving algorithm an essential tool for comparing IR systems with complex aggregated pages.
{"title":"Evaluating aggregated search using interleaving","authors":"A. Chuklin, Anne Schuth, Katja Hofmann, P. Serdyukov, M. de Rijke","doi":"10.1145/2505515.2505698","DOIUrl":"https://doi.org/10.1145/2505515.2505698","url":null,"abstract":"A result page of a modern web search engine is often much more complicated than a simple list of \"ten blue links.\" In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a system is called an aggregated or federated search system. Because search engines evolve over time, their results need to be constantly evaluated. However, one of the most efficient and widely used evaluation methods, interleaving, cannot be directly applied to aggregated search systems, as it ignores the need to group results originating from the same source (vertical results). We propose an interleaving algorithm that allows comparisons of search engine result pages containing grouped vertical documents. We compare our algorithm to existing interleaving algorithms and other evaluation methods (such as A/B-testing), both on real-life click log data and in simulation experiments. We find that our algorithm allows us to perform unbiased and accurate interleaved comparisons that are comparable to conventional evaluation techniques. We also show that our interleaving algorithm produces a ranking that does not substantially alter the user experience, while being sensitive to changes in both the vertical result block and the non-vertical document rankings. All this makes our proposed interleaving algorithm an essential tool for comparing IR systems with complex aggregated pages.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85464675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Disinformation Techniques for Entity Resolution
Steven Euijong Whang, H. Garcia-Molina
DOI: 10.1145/2505515.2505636

We study the problem of disinformation. We assume that an "agent" has some sensitive information that the "adversary" is trying to obtain. For example, a camera company (the agent) may secretly be developing its new camera model, and a user (the adversary) may want to know in advance the detailed specs of the model. The agent's goal is to disseminate false information to "dilute" what is known by the adversary. We model the adversary as an Entity Resolution (ER) process that pieces together available information. We formalize the problem of finding the disinformation with the highest benefit given a limited budget for creating the disinformation and propose efficient algorithms for solving the problem. We then evaluate our disinformation planning algorithms on real and synthetic data and compare the robustness of existing ER algorithms. In general, our disinformation techniques can be used as a framework for testing ER robustness.
{"title":"Disinformation techniques for entity resolution","authors":"Steven Euijong Whang, H. Garcia-Molina","doi":"10.1145/2505515.2505636","DOIUrl":"https://doi.org/10.1145/2505515.2505636","url":null,"abstract":"We study the problem of disinformation. We assume that an ``agent'' has some sensitive information that the ``adversary'' is trying to obtain. For example, a camera company (the agent) may secretly be developing its new camera model, and a user (the adversary) may want to know in advance the detailed specs of the model. The agent's goal is to disseminate false information to ``dilute'' what is known by the adversary. We model the adversary as an Entity Resolution (ER) process that pieces together available information. We formalize the problem of finding the disinformation with the highest benefit given a limited budget for creating the disinformation and propose efficient algorithms for solving the problem. We then evaluate our disinformation planning algorithms on real and synthetic data and compare the robustness of existing ER algorithms. In general, our disinformation techniques can be used as a framework for testing ER robustness.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"114 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82287366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Personalization of Web Search Using Short-Term Browsing Context
Yury Ustinovsky, P. Serdyukov
DOI: 10.1145/2505515.2505679

Search and browsing activity is known to be a valuable source of information about a user's search intent. It is extensively utilized by most modern search engines to improve ranking, both by constructing ranking features and by personalizing search. Personalization aims at two major goals: extracting a user's stable preferences, and specifying and disambiguating the current query. The common way to approach these problems is to extract information from a user's long-term search and browsing history and to utilize short-term history to determine the context of a given query. Personalizing web search for the first queries in new search sessions of new users is more difficult, due to the lack of both long- and short-term data. In this paper we study the problem of short-term personalization. More precisely, we restrict our attention to the initial queries of search sessions. These queries, lacking contextual information, are known to be the most challenging for short-term personalization and are not covered by previous studies on the subject. To approach this problem in the absence of search context, we employ short-term browsing context. We apply a widespread framework for personalizing search results based on re-ranking and evaluate our methods on large-scale data. The proposed methods are shown to significantly improve the non-personalized ranking of one of the major commercial search engines. To the best of our knowledge, this is the first study addressing the problem of short-term personalization based on recent browsing history. We find that the performance of this re-ranking approach can be reasonably well predicted given a query. When we restrict the use of our method to the queries with the largest expected gain, the resulting benefit of personalization increases significantly.
{"title":"Personalization of web-search using short-term browsing context","authors":"Yury Ustinovsky, P. Serdyukov","doi":"10.1145/2505515.2505679","DOIUrl":"https://doi.org/10.1145/2505515.2505679","url":null,"abstract":"Search and browsing activity is known to be a valuable source of information about user's search intent. It is extensively utilized by most of modern search engines to improve ranking by constructing certain ranking features as well as by personalizing search. Personalization aims at two major goals: extraction of stable preferences of a user and specification and disambiguation of the current query. The common way to approach these problems is to extract information from user's search and browsing long-term history and to utilize short-term history to determine the context of a given query. Personalization of the web search for the first queries in new search sessions of new users is more difficult due to the lack of both long- and short-term data. In this paper we study the problem of short-term personalization. To be more precise, we restrict our attention to the set of initial queries of search sessions. These, with the lack of contextual information, are known to be the most challenging for short-term personalization and are not covered by previous studies on the subject. To approach this problem in the absence of the search context, we employ short-term browsing context. We apply a widespread framework for personalization of search results based on the re-ranking approach and evaluate our methods on the large scale data. The proposed methods are shown to significantly improve non-personalized ranking of one of the major commercial search engines. To the best of our knowledge this is the first study addressing the problem of short-term personalization based on recent browsing history. We find that performance of this re-ranking approach can be reasonably predicted given a query. When we restrict the use of our method to the queries with largest expected gain, the resulting benefit of personalization increases significantly","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80591204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Inferring Anchor Links across Multiple Heterogeneous Social Networks
Xiangnan Kong, Jiawei Zhang, Philip S. Yu
DOI: 10.1145/2505515.2505531

Online social networks can often be represented as heterogeneous information networks containing abundant information about who, where, when, and what. Nowadays, people are usually involved in multiple social networks simultaneously. The multiple accounts of the same user in different networks are mostly isolated from each other, without any connection between them. Discovering the correspondence of these accounts across multiple social networks is a crucial prerequisite for many interesting inter-network applications, such as link recommendation and community analysis using information from multiple networks. In this paper, we study the problem of anchor link prediction across multiple heterogeneous social networks, i.e., discovering the correspondence among different accounts of the same user. Unlike most prior work on link prediction and network alignment, we assume that the anchor links are one-to-one relationships (i.e., no two edges share a common endpoint) between the accounts in two social networks, and that a small number of anchor links are known beforehand. We propose to extract heterogeneous features from multiple heterogeneous networks for anchor link prediction, including users' social, spatial, temporal, and textual information. We then formulate the inference problem for anchor links as a stable matching problem between the two sets of user accounts in the two networks. An effective solution, MNA (Multi-Network Anchoring), is derived to infer anchor links subject to the one-to-one constraint. Extensive experiments on two real-world heterogeneous social networks show that our MNA model consistently outperforms commonly used baselines on anchor link prediction.
{"title":"Inferring anchor links across multiple heterogeneous social networks","authors":"Xiangnan Kong, Jiawei Zhang, Philip S. Yu","doi":"10.1145/2505515.2505531","DOIUrl":"https://doi.org/10.1145/2505515.2505531","url":null,"abstract":"Online social networks can often be represented as heterogeneous information networks containing abundant information about: who, where, when and what. Nowadays, people are usually involved in multiple social networks simultaneously. The multiple accounts of the same user in different networks are mostly isolated from each other without any connection between them. Discovering the correspondence of these accounts across multiple social networks is a crucial prerequisite for many interesting inter-network applications, such as link recommendation and community analysis using information from multiple networks. In this paper, we study the problem of anchor link prediction across multiple heterogeneous social networks, i.e., discovering the correspondence among different accounts of the same user. Unlike most prior work on link prediction and network alignment, we assume that the anchor links are one-to-one relationships (i.e., no two edges share a common endpoint) between the accounts in two social networks, and a small number of anchor links are known beforehand. We propose to extract heterogeneous features from multiple heterogeneous networks for anchor link prediction, including user's social, spatial, temporal and text information. Then we formulate the inference problem for anchor links as a stable matching problem between the two sets of user accounts in two different networks. An effective solution, MNA (Multi-Network Anchoring), is derived to infer anchor links w.r.t. the one-to-one constraint. Extensive experiments on two real-world heterogeneous social networks show that our MNA model consistently outperform other commonly-used baselines on anchor link prediction.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82533704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting the Impact of Expansion Terms Using Semantic and User Interaction Features
A. Bakhtin, Yury Ustinovsky, P. Serdyukov
DOI: 10.1145/2505515.2507872

Query expansion for Information Retrieval is a challenging task. On the one hand, low-quality expansion may hurt either recall, due to vocabulary mismatch, or precision, due to topic drift, and thereby reduce user satisfaction. On the other hand, utilizing a large number of expansion terms for a query can easily lead to resource consumption overhead. As web search engines apply strict constraints on response time, it is essential to estimate the impact of each expansion term on query performance at pre-retrieval time. Our experimental results confirm that a significant fraction of expansions do not improve query performance, and that such expansions can be detected at pre-retrieval time.
{"title":"Predicting the impact of expansion terms using semantic and user interaction features","authors":"A. Bakhtin, Yury Ustinovsky, P. Serdyukov","doi":"10.1145/2505515.2507872","DOIUrl":"https://doi.org/10.1145/2505515.2507872","url":null,"abstract":"Query expansion for Information Retrieval is a challenging task. On the one hand, low quality expansion may hurt either recall, due to vocabulary mismatch, or precision, due to topic drift, and therefore reduce user satisfaction. On the other hand, utilizing a large number of expansion terms for a query may easily lead to resource consumption overhead. As web search engines apply strict constraints on response time, it is essential to estimate the impact of each expansion term on query performance at the pre-retrieval time. Our experimental results confirm that a significant part of expansions do not improve query performance, and it is possible to detect such expansions at the pre-retrieval time.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80869711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Users versus Models: What Observation Tells Us about Effectiveness Metrics
Alistair Moffat, Paul Thomas, Falk Scholer
DOI: 10.1145/2505515.2507665

Retrieval system effectiveness can be measured in two quite different ways: by monitoring the behavior of users and gathering data about the ease and accuracy with which they accomplish certain specified information-seeking tasks; or by using numeric effectiveness metrics to score system runs in reference to a set of relevance judgments. In the second approach, the effectiveness metric is chosen in the belief that user task performance, if it were to be measured by the first approach, should be linked to the score provided by the metric. This work explores that link, by analyzing the assumptions and implications of a number of effectiveness metrics, and exploring how these relate to observable user behaviors. Data recorded as part of a user study included user self-assessment of search task difficulty; gaze position; and click activity. Our results show that user behavior is influenced by a blend of many factors, including the extent to which relevant documents are encountered, the stage of the search process, and task difficulty. These insights can be used to guide development of batch effectiveness metrics.
{"title":"Users versus models: what observation tells us about effectiveness metrics","authors":"Alistair Moffat, Paul Thomas, Falk Scholer","doi":"10.1145/2505515.2507665","DOIUrl":"https://doi.org/10.1145/2505515.2507665","url":null,"abstract":"Retrieval system effectiveness can be measured in two quite different ways: by monitoring the behavior of users and gathering data about the ease and accuracy with which they accomplish certain specified information-seeking tasks; or by using numeric effectiveness metrics to score system runs in reference to a set of relevance judgments. In the second approach, the effectiveness metric is chosen in the belief that user task performance, if it were to be measured by the first approach, should be linked to the score provided by the metric. This work explores that link, by analyzing the assumptions and implications of a number of effectiveness metrics, and exploring how these relate to observable user behaviors. Data recorded as part of a user study included user self-assessment of search task difficulty; gaze position; and click activity. Our results show that user behavior is influenced by a blend of many factors, including the extent to which relevant documents are encountered, the stage of the search process, and task difficulty. These insights can be used to guide development of batch effectiveness metrics.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83023020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}