Query auto-completion (QAC) is one of the most prominent features of modern search engines. The list of query candidates is generated according to the prefix entered by the user in the search box and is updated on each new keystroke. Query prefixes tend to be short and ambiguous, and existing models mostly rely on the past popularity of matching candidates for ranking. However, the popularity of certain queries may vary drastically across different demographics and users. For instance, while instagram and imdb have comparable popularities overall and are both legitimate candidates to show for prefix i, the former is noticeably more popular among young female users, and the latter is more likely to be issued by men. In this paper, we present a supervised framework for personalizing auto-completion ranking. We introduce a novel labelling strategy for generating offline training labels that can be used for learning personalized rankers. We compare the effectiveness of several user-specific and demographic-based features and show that among them, the user's long-term search history and location are the most effective for personalizing auto-completion rankers. We perform our experiments on the publicly available AOL query logs, and also on the larger-scale logs of Bing. The results suggest that supervised rankers enhanced by personalization features can significantly outperform the existing popularity-based baselines, by up to 9% in terms of mean reciprocal rank (MRR).
{"title":"Learning to personalize query auto-completion","authors":"Milad Shokouhi","doi":"10.1145/2484028.2484076","DOIUrl":"https://doi.org/10.1145/2484028.2484076","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
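The abstract above reports its gains in mean reciprocal rank (MRR). As a minimal sketch of how that metric is commonly computed for auto-completion, the Python below scores each prefix by the reciprocal of the position at which the query the user eventually submitted appears in the candidate list. The function names and the toy candidate lists are assumptions made for illustration; they are not taken from the paper, and the paper's own labelling strategy is not reproduced here.

```python
from typing import Sequence, Tuple


def reciprocal_rank(ranked: Sequence[str], submitted: str) -> float:
    """Return 1/rank of the submitted query in the candidate list, or 0.0 if absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == submitted:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(cases: Sequence[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (ranked candidates, submitted query) pairs."""
    if not cases:
        return 0.0
    return sum(reciprocal_rank(ranked, submitted) for ranked, submitted in cases) / len(cases)


# Hypothetical rankings for prefix "i"; the user submits "instagram" in both cases.
cases = [
    (["imdb", "instagram", "ikea"], "instagram"),  # submitted query at rank 2
    (["instagram", "imdb", "ikea"], "instagram"),  # submitted query at rank 1
]
mrr = mean_reciprocal_rank(cases)
print(mrr)
```

A personalized ranker improves MRR whenever it moves the query a given user actually submits closer to the top of the list for that user's prefix.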
affects how users interact with a search system. Microeconomic theory is used to generate the cost-interaction hypothesis, which states that as the cost of querying increases, users will pose fewer queries and examine more documents per query. A between-subjects laboratory study with 36 undergraduate subjects was conducted, in which subjects were randomly assigned to use one of three search interfaces that varied according to the amount of physical cost required to query: Structured (high cost), Standard (medium cost) and Query Suggestion (low cost). Results show that subjects who used the Structured interface submitted significantly fewer queries, spent more time on search results pages, examined significantly more documents per query, and went to greater depths in the search results list. Results also showed that these subjects spent longer generating their initial queries, saved more relevant documents and rated their queries as more successful. These findings have implications for the usefulness of microeconomic theory as a way to model and explain search interaction, as well as for the design of query facilities.
{"title":"How query cost affects search behavior","authors":"L. Azzopardi, D. Kelly, Kathy Brennan","doi":"10.1145/2484028.2484049","DOIUrl":"https://doi.org/10.1145/2484028.2484049","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
The textual context of an element, structurally, contains traces of evidence. Utilizing this context in scoring is called contextualization. In this study we hypothesize that the context of an XML element, originating from its preceding and following elements in the sequential ordering of a document, improves the quality of retrieval. In the tree form of the document's structure, kinship contextualization means contextualization based on the horizontal and vertical elements in the kinship tree, or elements in a closer to a wider structural kinship. We have tested several variants of kinship contextualization and verified notable improvements in comparison with the baseline system and gold standards in the retrieval of focused elements.
{"title":"Kinship contextualization: utilizing the preceding and following structural elements","authors":"Muhammad Ali Norozi, Paavo Arvola","doi":"10.1145/2484028.2484111","DOIUrl":"https://doi.org/10.1145/2484028.2484111","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
L. Shou, Kuang Mao, Xinyuan Luo, Ke Chen, Gang Chen, Tianlei Hu
Singing is a popular social activity and a good way of expressing one's feelings. One important reason for an unsuccessful singing performance is that the singer fails to choose a suitable song. In this paper, we propose a novel singing competence-based song recommendation framework. It is distinguished from most existing music recommendation systems, which rely on the computation of listeners' interests or similarity. We model a singer's vocal competence as a singer profile, which takes voice pitch, intensity, and quality into consideration. Then we propose techniques to acquire singer profiles. We also present a song profile model, which is used to construct a human-annotated song database. Finally, we propose a learning-to-rank scheme for recommending songs by singer profile. The experimental study on real singers demonstrates the effectiveness of our approach and its advantages over two baseline methods. To the best of our knowledge, our work is the first to study competence-based song recommendation.
{"title":"Competence-based song recommendation","authors":"L. Shou, Kuang Mao, Xinyuan Luo, Ke Chen, Gang Chen, Tianlei Hu","doi":"10.1145/2484028.2484048","DOIUrl":"https://doi.org/10.1145/2484028.2484048","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
H. Huang, Yunjun Gao, Lu Chen, Rui Li, K. Chiew, Qinming He
Browsing with either web directories or social bookmarks is an important complement to keyword search in web information retrieval. To improve users' browsing experience and facilitate web directory construction, in this paper we propose a novel browsing system called Social Web Directory (SWD for short), which integrates web directories and social bookmarks. In SWD, (1) web pages are automatically categorized into a hierarchical structure so that they can be retrieved efficiently, and (2) the popular web pages, hottest tags, and expert users in each category are ranked to help users find information more conveniently. Extensive experimental results demonstrate the effectiveness of our SWD system.
{"title":"Browse with a social web directory","authors":"H. Huang, Yunjun Gao, Lu Chen, Rui Li, K. Chiew, Qinming He","doi":"10.1145/2484028.2484141","DOIUrl":"https://doi.org/10.1145/2484028.2484141","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
Search systems use context to effectively satisfy a user's information need as expressed by a query. Tasks are important factors in determining user context during search and many studies have been conducted that identify tasks and task stages through users' interaction behavior with search systems. The type of interaction available to users, however, depends on the type of search interface features available. Queries are the most pervasive input from users to express their information need regardless of the input method, e.g., typing keywords or clicking facets. Instead of characterizing interaction behavior in terms of interface specific components, we propose to characterize users' search behavior in terms of two types of query modification: (i) direct modification, which refers to reformulations of queries; and (ii) indirect modification, which refers to user operations on additional input components provided by various search interfaces. We investigate the utility of characterizing task stages through direct and indirect query reformulations in a case study and find that it is possible to effectively differentiate subsequent stages of the search task. We found that describing user interaction behavior in such a generic form allowed us to relate user actions to search task stages independent from the specific search interface deployed. The next step will then be to validate this idea in a setting with a wider palette of search tasks and tools.
{"title":"Characterizing stages of a multi-session complex search task through direct and indirect query modifications","authors":"Jiyin He, M. Bron, A. D. Vries","doi":"10.1145/2484028.2484178","DOIUrl":"https://doi.org/10.1145/2484028.2484178","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
Expert retrieval has been widely studied, especially since the introduction of the Expert Finding task in TREC's Enterprise Track in 2005 [3]. This track provided two different test collections crawled from two organizations' public-facing websites and internal emails, which led to the development of many state-of-the-art algorithms for expert retrieval [1]. Until recently, these datasets were considered good representatives of the information resources available within an enterprise. However, the recent growth of social media has also influenced the work environment, and social media has become a common communication and collaboration tool within organizations. According to a recent survey by the McKinsey Global Institute [2], 29% of companies use at least one social media tool for matching their employees to tasks, and 26% of them assess their employees' performance by using social media. This shows that intra-organizational social media has become an important resource for identifying expertise within organizations. In recent years, in addition to intra-organizational social media, public social media tools like Twitter, Facebook, and LinkedIn have also become common environments for searching for expertise. These tools give their users an opportunity to show their specific skills to the world, which motivates recruiters to look for talented job candidates on social media, and writers and reporters to find experts to consult on the specific topics they are working on. With these motivations in mind, in this work we propose to develop expert retrieval algorithms for intra-organizational and public social media tools. Social media datasets have both challenges and advantages. In terms of challenges, they do not always contain content on one specific domain; instead, one social media tool may contain discussions on technical topics, hobbies, and news concurrently. They may also contain spam posts or advertisements. Compared to well-edited enterprise documents, they are much more informal in language. Furthermore, depending on the social media platform, they may have limits on the number of characters used in posts. Despite these challenges, they also bring some unique authority signals, such as votes, comments, and follower/following information, which can be useful in estimating expertise. Furthermore, compared to previously used enterprise documents, social media provides clear associations between documents and candidates through authorship information. In this work, we propose to develop expert retrieval approaches that handle these challenges while making use of the advantages. Expert retrieval is a very useful application by itself; furthermore, it can be a step towards improving other social media applications. Social media is different from other web-based tools mainly because it is dependent on its users. In social media, users are not just content consumers; they are also the primary, and sometimes the only, content creators. Therefore, the quality of any user-generated content in social media depends on its creator. In this work, we propose to use the expertise of users to improve existing applications, so that they can estimate the relevance of content based not only on the content itself but also on the expertise of its creator. By using the expertise of content creators, we also hope to promote more reliable content. We propose to use this user expertise information to improve ad hoc search and question answering applications in social media. In this work, the previous TREC enterprise datasets, available intra-organizational social media, and public social media datasets will be used to test the proposed algorithms.
{"title":"Effective approaches to retrieving and using expertise in social media","authors":"Reyyan Yeniterzi","doi":"10.1145/2484028.2484230","DOIUrl":"https://doi.org/10.1145/2484028.2484230","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
A. Freitas, Fabrício F. de Faria, Seán O'Riain, E. Curry
This paper demonstrates Treo, a natural language query mechanism for Linked Data graphs. The approach uses a distributional semantic vector space model to semantically match user query terms with data, supporting vocabulary-independent (or schema-agnostic) queries over structured data.
{"title":"Answering natural language queries over linked data graphs: a distributional semantics approach","authors":"A. Freitas, Fabrício F. de Faria, Seán O'Riain, E. Curry","doi":"10.1145/2484028.2484209","DOIUrl":"https://doi.org/10.1145/2484028.2484209","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
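The Treo abstract above describes matching user query terms to dataset vocabulary through a distributional semantic vector space. The sketch below illustrates only the general idea of such matching, using cosine similarity over hand-made toy vectors. The vectors, the term names, and the `best_match` helper are hypothetical and chosen for illustration; the abstract does not specify Treo's actual distributional model or its matching procedure.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy distributional vectors (assumed values, not learned from any corpus).
TOY_VECTORS: Dict[str, List[float]] = {
    "wife": [0.9, 0.1, 0.0],
    "spouse": [0.8, 0.2, 0.1],
    "birthPlace": [0.0, 0.9, 0.3],
    "almaMater": [0.1, 0.2, 0.9],
}


def best_match(query_term: str, vocabulary: Sequence[str]) -> str:
    """Return the vocabulary term whose vector is closest to the query term's vector."""
    query_vector = TOY_VECTORS[query_term]
    return max(vocabulary, key=lambda term: cosine(query_vector, TOY_VECTORS[term]))


# The query word "wife" does not appear in the dataset vocabulary, but it can
# still be mapped to the semantically closest property name.
match = best_match("wife", ["spouse", "birthPlace", "almaMater"])
print(match)
```

Because the comparison happens in vector space rather than by string equality, the user does not need to know the exact property names used in the data, which is the sense in which such queries are vocabulary-independent.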
Electronic medical records (EMRs) are being increasingly used worldwide to facilitate improved healthcare services [2,3]. They describe the clinical decision process relating to a patient, detailing the observed symptoms, the conducted diagnostic tests, the identified diagnoses and the prescribed treatments. However, medical records search is challenging, due to the implicit knowledge inherent within the medical records - such knowledge may be known by medical practitioners, but hidden to an information retrieval (IR) system [3]. For instance, the mention of a treatment such as a drug may indicate to a practitioner that a particular diagnosis has been made even if this was not explicitly mentioned in the patient's EMRs. Moreover, the fact that a symptom has not been observed by a clinician may rule out some specific diagnoses. Our work focuses on searching EMRs to identify patients with medical histories relevant to the medical condition(s) stated in a query. The resulting system can be beneficial to healthcare providers, administrators, and researchers who may wish to analyse the effectiveness of a particular medical procedure to combat a specific disease [2,4]. During retrieval, a healthcare provider may indicate a number of inclusion criteria to describe the type of patients of interest. For example, the used criteria may include personal profiles (e.g. age and gender) or some specific medical symptoms and tests, allowing to identify patients that have EMRs matching the criteria. To attain effective retrieval performance, we hypothesise that, in such a medical IR system, both the information needs and patients should be modelled based on how the medical process is developed. Specifically, our thesis states that since the medical decision process typically encompasses four aspects (symptom, diagnostic test, diagnosis, and treatment), a medical search system should take into account these aspects and apply inferences to recover possible implicit knowledge. 
We postulate that considering these aspects and their derived implicit knowledge at different levels of the retrieval process (namely, sentence, record, and inter-record level) enhances the retrieval performance. Indeed, we propose to build a query and patient understanding framework that can gain insights from EMRs and queries, by modelling and reasoning during retrieval in terms of the four aforementioned aspects (symptom, diagnostic test, diagnosis, and treatment) at three different levels of the retrieval process.
{"title":"A query and patient understanding framework for medical records search","authors":"Nut Limsopatham","doi":"10.1145/2484028.2484228","DOIUrl":"https://doi.org/10.1145/2484028.2484228","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}
Twitter is currently one of the largest social hubs where users spread and discuss news. Most top news stories have corresponding discussions on social media. This demonstration presents TweetMogaz, a platform for microblog search and filtering. It creates a comprehensive real-time report of what people discuss and share about news in certain regions. TweetMogaz reports the most popular tweets, jokes, videos, images, and news articles that people share about top news stories. Moreover, it allows users to search for specific topics. A scalable automatic microblog filtering technique is used to obtain tweets relevant to a given news category in a region. TweetMogaz.com demonstrates the effectiveness of our filtering technique for reporting, in real time, the public response to news in different Arabic regions, including Egypt and Syria.
{"title":"TweetMogaz: a news portal of tweets","authors":"Walid Magdy","doi":"10.1145/2484028.2484212","DOIUrl":"https://doi.org/10.1145/2484028.2484212","journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval"},"publicationDate":"2013-07-28"}