Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval: latest papers

Learning to personalize query auto-completion

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484076

Milad Shokouhi

Query auto-completion (QAC) is one of the most prominent features of modern search engines. The list of query candidates is generated according to the prefix entered by the user in the search box and is updated on each new keystroke. Query prefixes tend to be short and ambiguous, and existing models mostly rely on the past popularity of matching candidates for ranking. However, the popularity of certain queries may vary drastically across different demographics and users. For instance, while instagram and imdb have comparable popularity overall and are both legitimate candidates to show for prefix i, the former is noticeably more popular among young female users, and the latter is more likely to be issued by men. In this paper, we present a supervised framework for personalizing auto-completion ranking. We introduce a novel labelling strategy for generating offline training labels that can be used for learning personalized rankers. We compare the effectiveness of several user-specific and demographic-based features and show that among them, the user's long-term search history and location are the most effective for personalizing auto-completion rankers. We perform our experiments on the publicly available AOL query logs, and also on the larger-scale logs of Bing. The results suggest that supervised rankers enhanced by personalization features can significantly outperform the existing popularity-based baselines in terms of mean reciprocal rank (MRR), by up to 9%.
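To make the popularity baseline and the MRR metric mentioned in this abstract concrete, here is a minimal Python sketch. It is not the paper's implementation: the toy query log, the function names, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter

def complete(prefix, query_log, k=3):
    """Rank candidates that start with `prefix` by past frequency (popularity baseline)."""
    counts = Counter(q for q in query_log if q.startswith(prefix))
    # Sort by descending count, then alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [q for q, _ in ranked[:k]]

def reciprocal_rank(ranked, target):
    """Return 1/rank of the query the user actually submitted, or 0.0 if it is absent."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(cases, query_log):
    """`cases` is a non-empty list of (prefix, submitted_query) pairs."""
    scores = [reciprocal_rank(complete(p, query_log), t) for p, t in cases]
    return sum(scores) / len(scores)

# Hypothetical toy data, not taken from the AOL or Bing logs.
toy_log = ["imdb", "imdb", "imdb", "instagram", "instagram", "ikea"]
cases = [("i", "imdb"), ("i", "instagram"), ("in", "instagram")]
print(mean_reciprocal_rank(cases, toy_log))
```

A personalized ranker of the kind the abstract describes would reorder the candidates returned by `complete` using user-specific features; a higher MRR then means the submitted query tends to appear nearer the top of the list.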

Citations: 189

How query cost affects search behavior

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484049

L. Azzopardi, D. Kelly, Kathy Brennan

This paper examines how the cost of querying affects how users interact with a search system. Microeconomic theory is used to generate the cost-interaction hypothesis, which states that as the cost of querying increases, users will pose fewer queries and examine more documents per query. A between-subjects laboratory study with 36 undergraduate subjects was conducted, in which subjects were randomly assigned to use one of three search interfaces that varied according to the amount of physical cost required to query: Structured (high cost), Standard (medium cost) and Query Suggestion (low cost). Results show that subjects who used the Structured interface submitted significantly fewer queries, spent more time on search results pages, examined significantly more documents per query, and went to greater depths in the search results list. Results also showed that these subjects spent longer generating their initial queries, saved more relevant documents and rated their queries as more successful. These findings have implications for the usefulness of microeconomic theory as a way to model and explain search interaction, as well as for the design of query facilities.

Citations: 88

Kinship contextualization: utilizing the preceding and following structural elements

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484111

Muhammad Ali Norozi, Paavo Arvola

The textual context of an element, structurally, contains traces of evidence. Utilizing this context in scoring is called contextualization. In this study we hypothesize that the context of an XML element, derived from its preceding and following elements in the sequential ordering of a document, improves the quality of retrieval. In the tree form of the document's structure, kinship contextualization means contextualization based on the horizontal and vertical elements in the kinship tree, or on elements ranging from a closer to a wider structural kinship. We have tested several variants of kinship contextualization and verified notable improvements in comparison with the baseline system and gold standards in the retrieval of focused elements.

Citations: 4

Competence-based song recommendation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484048

L. Shou, Kuang Mao, Xinyuan Luo, Ke Chen, Gang Chen, Tianlei Hu

Singing is a popular social activity and a good way of expressing one's feelings. One important reason for an unsuccessful singing performance is that the singer fails to choose a suitable song. In this paper, we propose a novel singing competence-based song recommendation framework. It is distinguished from most existing music recommendation systems, which rely on the computation of listeners' interests or similarity. We model a singer's vocal competence as a singer profile, which takes voice pitch, intensity, and quality into consideration. Then we propose techniques to acquire singer profiles. We also present a song profile model which is used to construct a human-annotated song database. Finally, we propose a learning-to-rank scheme for recommending songs by singer profile. The experimental study on real singers demonstrates the effectiveness of our approach and its advantages over two baseline methods. To the best of our knowledge, our work is the first to study competence-based song recommendation.

Citations: 6

Browse with a social web directory

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484141

H. Huang, Yunjun Gao, Lu Chen, Rui Li, K. Chiew, Qinming He

Browsing with either web directories or social bookmarks is an important complement to keyword search in web information retrieval. To improve users' browsing experience and facilitate web directory construction, we propose a novel browsing system called Social Web Directory (SWD for short) that integrates web directories and social bookmarks. In SWD, (1) web pages are automatically categorized into a hierarchical structure so that they can be retrieved efficiently, and (2) the popular web pages, hottest tags, and expert users in each category are ranked to help users find information more conveniently. Extensive experimental results demonstrate the effectiveness of our SWD system.

Citations: 4

Characterizing stages of a multi-session complex search task through direct and indirect query modifications

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484178

Jiyin He, M. Bron, A. D. Vries

Search systems use context to effectively satisfy a user's information need as expressed by a query. Tasks are important factors in determining user context during search and many studies have been conducted that identify tasks and task stages through users' interaction behavior with search systems. The type of interaction available to users, however, depends on the type of search interface features available. Queries are the most pervasive input from users to express their information need regardless of the input method, e.g., typing keywords or clicking facets. Instead of characterizing interaction behavior in terms of interface specific components, we propose to characterize users' search behavior in terms of two types of query modification: (i) direct modification, which refers to reformulations of queries; and (ii) indirect modification, which refers to user operations on additional input components provided by various search interfaces. We investigate the utility of characterizing task stages through direct and indirect query reformulations in a case study and find that it is possible to effectively differentiate subsequent stages of the search task. We found that describing user interaction behavior in such a generic form allowed us to relate user actions to search task stages independent from the specific search interface deployed. The next step will then be to validate this idea in a setting with a wider palette of search tasks and tools.

Citations: 16

Effective approaches to retrieving and using expertise in social media

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484230

Reyyan Yeniterzi

Expert retrieval has been widely studied, especially after the introduction of the Expert Finding task in TREC's Enterprise Track in 2005 [3]. This track provided two different test collections crawled from two organizations' public-facing websites and internal emails, which led to the development of many state-of-the-art algorithms for expert retrieval [1]. Until recently, these datasets were considered good representatives of the information resources available within an enterprise. However, the recent growth of social media also influenced the work environment, and social media became a common communication and collaboration tool within organizations. According to a recent survey by McKinsey Global Institute [2], 29% of the companies use at least one social media tool for matching their employees to tasks, and 26% of them assess their employees' performance by using social media. This shows that intra-organizational social media became an important resource for identifying expertise within organizations. In recent years, in addition to intra-organizational social media, public social media tools like Twitter, Facebook, and LinkedIn also became common environments for searching for expertise. These tools give their users an opportunity to show their specific skills to the world, which motivates recruiters to look for talented job candidates on social media, or writers and reporters to find experts to consult on the specific topics they are working on. With these motivations in mind, in this work we propose to develop expert retrieval algorithms for intra-organizational and public social media tools. Social media datasets have both challenges and advantages. In terms of challenges, they do not always contain context on one specific domain; instead, one social media tool may contain discussions on technical stuff, hobbies or news concurrently. They may also contain spam posts or advertisements. Compared to well-edited enterprise documents, they are much more informal in language. Furthermore, depending on the social media platform, they may have limits on the number of characters used in posts. Even though they include the challenges stated above, they also bring some unique authority signals, such as votes, comments, and follower/following information, which can be useful in estimating expertise. Furthermore, compared to previously used enterprise documents, social media provides clear associations between documents and candidates in the context of authorship information. In this work, we propose to develop expert retrieval approaches which will handle these challenges while making use of the advantages. Expert retrieval is a very useful application by itself; furthermore, it can be a step towards improving other social media applications. Social media is different from other web-based tools mainly because it is dependent on its users. In social media, users are not just content consumers, but they are also the primary and sometimes the only content creators. Therefore, the quality of any user-generated content in social media depends on its creator. In this thesis, we propose to use the expertise of users to improve existing applications, so that they can estimate the relevance of content based not only on the content itself but also on the expertise of its creator. By using the expertise of content creators, we also hope to surface more reliable content. We propose to use this user expertise information to improve ad hoc search and question answering applications in social media. In this work, the previous TREC enterprise datasets, available intra-organizational social media, and public social media datasets will be used to test the proposed algorithms.

Citations: 3

Answering natural language queries over linked data graphs: a distributional semantics approach

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484209

A. Freitas, Fabrício F. de Faria, Seán O'Riain, E. Curry

This paper demonstrates Treo, a natural language query mechanism for Linked Data graphs. The approach uses a distributional semantic vector space model to semantically match user query terms with data, supporting vocabulary-independent (or schema-agnostic) queries over structured data.
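The idea of matching query terms to data vocabulary in a vector space can be illustrated with a small sketch. This is not Treo's actual model: the three-dimensional vectors, the predicate names, and the helper functions below are hypothetical, and cosine similarity is used here only as a common choice of similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(term_vec, predicate_vecs):
    """Return the predicate whose vector is closest to the query term's vector."""
    return max(predicate_vecs, key=lambda name: cosine(term_vec, predicate_vecs[name]))

# Hypothetical toy vectors; a real system would derive them from corpus statistics.
toy_predicates = {
    "spouse": [0.9, 0.1, 0.0],
    "birthPlace": [0.1, 0.9, 0.2],
    "author": [0.0, 0.2, 0.9],
}
query_term_vec = [0.8, 0.2, 0.1]  # stands in for a query word such as "married"

print(best_match(query_term_vec, toy_predicates))
```

Because the match is made on vector similarity rather than on string equality, the query term does not need to appear in the dataset's vocabulary, which is the sense in which such queries are vocabulary-independent.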

Citations: 7

A query and patient understanding framework for medical records search

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484228

Nut Limsopatham

Electronic medical records (EMRs) are being increasingly used worldwide to facilitate improved healthcare services [2,3]. They describe the clinical decision process relating to a patient, detailing the observed symptoms, the conducted diagnostic tests, the identified diagnoses and the prescribed treatments. However, medical records search is challenging, due to the implicit knowledge inherent within the medical records - such knowledge may be known by medical practitioners, but hidden to an information retrieval (IR) system [3]. For instance, the mention of a treatment such as a drug may indicate to a practitioner that a particular diagnosis has been made even if this was not explicitly mentioned in the patient's EMRs. Moreover, the fact that a symptom has not been observed by a clinician may rule out some specific diagnoses. Our work focuses on searching EMRs to identify patients with medical histories relevant to the medical condition(s) stated in a query. The resulting system can be beneficial to healthcare providers, administrators, and researchers who may wish to analyse the effectiveness of a particular medical procedure to combat a specific disease [2,4]. During retrieval, a healthcare provider may indicate a number of inclusion criteria to describe the type of patients of interest. For example, the criteria used may include personal profiles (e.g. age and gender) or some specific medical symptoms and tests, making it possible to identify patients whose EMRs match the criteria. To attain effective retrieval performance, we hypothesise that, in such a medical IR system, both the information needs and patients should be modelled based on how the medical process is developed. Specifically, our thesis states that since the medical decision process typically encompasses four aspects (symptom, diagnostic test, diagnosis, and treatment), a medical search system should take into account these aspects and apply inferences to recover possible implicit knowledge. We postulate that considering these aspects and their derived implicit knowledge at different levels of the retrieval process (namely, sentence, record, and inter-record level) enhances the retrieval performance. Indeed, we propose to build a query and patient understanding framework that can gain insights from EMRs and queries, by modelling and reasoning during retrieval in terms of the four aforementioned aspects (symptom, diagnostic test, diagnosis, and treatment) at three different levels of the retrieval process.

Citations: 2

TweetMogaz: a news portal of tweets

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484212

Walid Magdy

Twitter is currently one of the largest social hubs where users spread and discuss news. Most top news stories have corresponding discussions on social media. This demonstration presents TweetMogaz, a platform for microblog search and filtering. It creates a comprehensive real-time report of what people discuss and share about news happening in certain regions. TweetMogaz reports the most popular tweets, jokes, videos, images, and news articles that people share about top news stories. Moreover, it allows users to search for specific topics. A scalable automatic technique for microblog filtering is used to obtain tweets relevant to a certain news category in a region. TweetMogaz.com demonstrates the effectiveness of our filtering technique for reporting, in real time, public response to news in different Arabic regions, including Egypt and Syria.

Citations: 10
