
Foundations and Trends in Information Retrieval: Latest Articles

LifeLogging: Personal Big Data
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2014-06-20; DOI: 10.1561/1500000033
C. Gurrin, A. Smeaton, A. Doherty
We have recently observed a convergence of technologies to foster the emergence of lifelogging as a mainstream activity. Computer storage has become significantly cheaper, and advancements in sensing technology allow for the efficient sensing of personal activities, locations and the environment. This is best seen in the growing popularity of the quantified self movement, in which life activities are tracked using wearable sensors in the hope of better understanding human performance in a variety of tasks. This review aims to provide a comprehensive summary of lifelogging, to cover its research history, current technologies, and applications. Thus far, most of the lifelogging research has focused predominantly on visual lifelogging, hence we maintain this focus in this review. However, we also reflect on the challenges lifelogging poses for information access and retrieval in general. This review is a suitable reference for those seeking an information retrieval scientist's perspective on lifelogging and the quantified self.
Citations: 406
Computational Advertising: Techniques for Targeting Relevant Ads
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2014-01-01; DOI: 10.1561/1500000045
Kushal S. Dave, Vasudeva Varma
Citations: 21
Music Information Retrieval: Recent Developments and Applications
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2014-01-01; DOI: 10.1561/9781601988331
Markus Schedl, Emilia Gómez, Julián Urbano
Citations: 0
Information Retrieval for E-Discovery
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2013-06-28; DOI: 10.1561/1500000025
Douglas W. Oard, William Webber
E-discovery refers generally to the process by which one party (for example, the plaintiff) is entitled to discover evidence in the form of electronically stored information that is held by another party (for example, the defendant), and that is relevant to some matter that is the subject of civil litigation (that is, what is commonly called a "lawsuit"). Information Retrieval for E-Discovery describes the emergence of the field, identifies the information retrieval issues that arise, reviews the work to date on this topic, and summarizes major open issues. Information Retrieval for E-Discovery is an ideal primer for anyone with an interest in e-discovery; be it researchers who first practiced law but now study information retrieval, or those who studied information retrieval but now practice law.
Citations: 30
Patent Retrieval
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2013-03-11; DOI: 10.1561/1500000027
M. Lupu, A. Hanbury
Intellectual property and the patent system in particular have been extremely present in research and discussion, even in the public media, in the last few years. Without going into any controversial issues regarding the patent system, we approach a very real and growing problem: searching for innovation. The target collection for this task does not consist of patent documents only, but it is in these documents that the main difference is found compared to web or news information retrieval. In addition, the issue of patent search implies a particular user model and search process model. This review is concerned with how research and technology in the field of Information Retrieval assists or even changes the processes of patent search. It is a survey of work done on patent data in relation to Information Retrieval in the last 20–25 years. It explains the sources of difficulty and the existing document processing and retrieval methods of the domain, and provides a motivation for further research in the area.
Citations: 73
Contextual Search: A Computational Framework
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2012-10-08; DOI: 10.1561/1500000023
M. Melucci
1: Introduction 2: Query Intent 3: Personal Interest 4: Document Quality 5: Contextual Search Evaluation 6: Conclusions. Acknowledgements. References. A. Implementations
Citations: 41
Expertise Retrieval
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2012-08-12; DOI: 10.1561/1500000024
K. Balog, Yi Fang, M. de Rijke, P. Serdyukov, Luo Si
People have looked for experts since before the advent of computers. With advances in information retrieval technology and the large-scale availability of digital traces of knowledge-related activities, computer systems that can fully automate the process of locating expertise have become a reality. The past decade has witnessed tremendous interest, and a wealth of results, in expertise retrieval as an emerging subdiscipline in information retrieval. This survey highlights advances in models and algorithms relevant to this field. We draw connections among methods proposed in the literature and summarize them in five groups of basic approaches. These serve as the building blocks for more advanced models that arise when we consider a range of content-based factors that may impact the strength of association between a topic and a person. We also discuss practical aspects of building an expert search system and present applications of the technology in other domains, such as blog distillation and entity retrieval. The limitations of current approaches are also pointed out. We end our survey with a set of conjectures on what the future may hold for expertise retrieval research.
Citations: 227
Information Retrieval on the Blogosphere
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2012-08-01; DOI: 10.1561/1500000026
Rodrygo L. T. Santos, C. Macdonald, R. McCreadie, I. Ounis, I. Soboroff
Blogs have recently emerged as a new open, rapidly evolving and reactive publishing medium on the Web. Rather than managed by a central entity, the content on the blogosphere — the collection of all blogs on the Web — is produced by millions of independent bloggers, who can write about virtually anything. This open publishing paradigm has led to a growing mass of user-generated content on the Web, which can vary tremendously both in format and quality when looked at in isolation, but which can also reveal interesting patterns when observed in aggregation. One field particularly interested in studying how information is produced, consumed, and searched in the blogosphere is information retrieval. In this survey, we review the published literature on searching the blogosphere. In particular, we describe the phenomenon of blogging and the motivations for searching for information on blogs. We cover both the search tasks underlying blog searchers' information needs and the most successful approaches to these tasks. These include blog post and full blog search tasks, as well as blog-aided search tasks, such as trend and market analysis. Finally, we also describe the publicly available resources that support research on searching the blogosphere.
Citations: 26
Spoken Content Retrieval: A Survey of Techniques and Technologies
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2012-02-23; DOI: 10.1561/1500000020
M. Larson, G. Jones
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
Citations: 81
Federated Search
IF 10.4; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2011-03-05; DOI: 10.1561/1500000010
Milad Shokouhi, Luo Si
Federated search (federated information retrieval or distributed information retrieval) is a technique for searching multiple text collections simultaneously. Queries are submitted to a subset of collections that are most likely to return relevant answers. The results returned by selected collections are integrated and merged into a single list. Federated search is preferred over centralized search alternatives in many environments. For example, commercial search engines such as Google cannot easily index uncrawlable hidden web collections, while federated search systems can search the contents of hidden web collections without crawling. In enterprise environments, where each organization maintains an independent search engine, federated search techniques can provide parallel search over multiple collections. There are three major challenges in federated search. For each query, a subset of collections that are most likely to return relevant documents is selected. This creates the collection selection problem. To be able to select suitable collections, federated search systems need to acquire some knowledge about the contents of each collection, creating the collection representation problem. The results returned from the selected collections are merged before the final presentation to the user. This final step is the result merging problem. The goal of this work is to provide a comprehensive summary of the previous research on the federated search challenges described above.
Citations: 167
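The three problems named in the Federated Search abstract (collection representation, collection selection, result merging) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the survey: the `COLLECTIONS` data, the function names, the term-frequency summaries, and the choice to merge by raw overlap scores as if they were comparable across collections.

```python
from collections import Counter

# Hypothetical toy collections: name -> list of documents (not from the survey).
COLLECTIONS = {
    "news": ["federated search merges results", "stock markets fall"],
    "patents": ["patent search for prior art", "search engine patent claims"],
    "blogs": ["my cat photos", "travel diary from rome"],
}

def represent(docs):
    """Collection representation: a bag-of-words term-frequency summary."""
    return Counter(word for doc in docs for word in doc.split())

def select_collections(query, summaries, k=2):
    """Collection selection: rank collections by summed query-term frequency."""
    terms = query.split()
    scores = {name: sum(tf[t] for t in terms) for name, tf in summaries.items()}
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return [name for name in ranked[:k] if scores[name] > 0]

def search(query, docs):
    """Per-collection search: score each document by query-term overlap."""
    terms = set(query.split())
    hits = [(len(terms & set(doc.split())), doc) for doc in docs]
    return [(score, doc) for score, doc in hits if score > 0]

def federated_search(query, collections, k=2):
    """Select collections, query each one, and merge into a single list."""
    summaries = {name: represent(docs) for name, docs in collections.items()}
    merged = []
    for name in select_collections(query, summaries, k):
        merged.extend(
            (score, name, doc) for score, doc in search(query, collections[name])
        )
    # Result merging: raw scores are assumed comparable across collections here.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1], hit[2]))

results = federated_search("patent search", COLLECTIONS)
```

In this sketch the "blogs" collection is never queried for "patent search" because its summary contains none of the query terms. Real systems typically cannot see full collection contents and need score normalization before merging, which this toy version deliberately skips.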