Links: information retrieval

Syed S. Ali, S. McRoy
DOI: 10.1145/355137.355141 (https://doi.org/10.1145/355137.355141)
Journal: Appl. Intell., vol. 41, no. 1, pp. 17-19
Published: 2000-12-01
Publication type: Journal Article
Citations: 82

Abstract

An information retrieval (IR) system informs the user about the existence and whereabouts of documents or data relating to a query made by the user. Traditional methods for automated information retrieval are largely based on searching and indexing techniques performed by people (such as librarians). Figure 1 illustrates the operation of a generic IR system. In Figure 1, the user enters a query (in this example, a Boolean query that asks the IR system to find documents that contain the phrase "information retrieval" as well as the word "resources"). The user query may be processed (for example, to convert the plural "resources" to the singular "resource") and matched against a database of documents that have been preprocessed to speed up matching. The database can be a local document collection or a collection of networked documents, such as those on the World Wide Web (WWW). The output of the IR system is typically a ranked list of documents. Some IR systems provide an option for user feedback, such as asking the user for an opinion on the quality of the matches, and can use this feedback to improve the quality of the search. Increased capabilities of computer hardware and software have created a vast body of machine-readable resources. Typically there is no lack of available information; more often, users, seeking needles in haystacks, are overwhelmed by the quantity of irrelevant information. Often this is caused by a poor query (too vague or too generic; for example, try searching for "computer science"). Even with a well-formulated, specific query (such as the one in Figure 1), results can be poor (for example, Google.com returned as one match a document titled "Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining"). The popularity of the Web has spurred enormous growth in the number and types of available resources.
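The query pipeline described above (query processing, matching against a preprocessed database, ranked output) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the system shown in Figure 1: the toy documents, the function names, the naive plural stripping, and the ranking by raw term counts are all hypothetical.

```python
import re
from collections import defaultdict

def normalize(token):
    """Naive plural stripping (an assumption; real systems use proper stemmers)."""
    token = token.lower()
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def tokenize(text):
    return [normalize(t) for t in re.findall(r"[A-Za-z]+", text)]

def build_index(docs):
    """Preprocess documents into an inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def phrase_docs(index, phrase):
    """Ids of documents containing the phrase terms at consecutive positions."""
    terms = tokenize(phrase)
    if not terms:
        return set()
    candidates = set(index.get(terms[0], {}))
    for t in terms[1:]:
        candidates &= set(index.get(t, {}))
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits

def search(index, phrase, word):
    """Boolean AND of a phrase and a word, ranked by total term occurrences."""
    word_term = normalize(word)
    matches = phrase_docs(index, phrase) & set(index.get(word_term, {}))
    terms = set(tokenize(phrase)) | {word_term}

    def score(doc_id):
        return sum(len(index[t].get(doc_id, [])) for t in terms)

    return sorted(matches, key=lambda d: (-score(d), d))

# Hypothetical toy collection.
docs = {
    "d1": "Information retrieval resources for students of information retrieval.",
    "d2": "A resource list on information retrieval.",
    "d3": "Retrieval of information about natural resources.",
    "d4": "Distributed information search and retrieval for resource discovery.",
}
index = build_index(docs)
results = search(index, "information retrieval", "resources")
print(results)  # ['d1', 'd2']
```

In this sketch, d3 and d4 contain all three words but are rejected because "information" and "retrieval" never appear side by side. A system that matched on keywords alone would return them, which is one way a result like the astronomy document mentioned above can slip into the list.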
Many networked information retrieval (NIR) tools can be used to search the Web and provide information on demand to unsophisticated end users. Search engines are a simple example; typically they use a program (called a spider) that traverses the Web and builds databases of the keywords found in each Web page (allowing fast, local retrieval of these resources). IR systems, such as search engines, are most useful when the user makes a precise query, has a clear idea what …
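The spider behaviour described above (traverse linked pages, record which keywords occur on which page) might look roughly like the following sketch. To stay runnable without network access it crawls a hypothetical in-memory "web"; the URLs, the `TOY_WEB` table, and the `crawl` function are illustrative assumptions, not the design of any particular search engine.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory stand-in for the Web: URL -> (page text, outgoing links).
TOY_WEB = {
    "http://example.org/": (
        "Information retrieval links",
        ["http://example.org/ir", "http://example.org/se"],
    ),
    "http://example.org/ir": (
        "Retrieval resources and tutorials",
        ["http://example.org/"],
    ),
    "http://example.org/se": (
        "Search engines and spiders",
        ["http://example.org/ir", "http://example.org/missing"],
    ),
}

def crawl(seed, fetch):
    """Breadth-first traversal from `seed`; returns keyword -> set of URLs."""
    keyword_db = defaultdict(set)
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # unreachable page: skip it
            continue
        text, links = page
        for word in re.findall(r"[a-z]+", text.lower()):
            keyword_db[word].add(url)
        for link in links:
            if link not in seen:  # visit each URL at most once
                seen.add(link)
                queue.append(link)
    return keyword_db

db = crawl("http://example.org/", TOY_WEB.get)
print(sorted(db["retrieval"]))
# ['http://example.org/', 'http://example.org/ir']
```

Once the keyword database exists, a lookup such as `db["retrieval"]` is a local dictionary access, which is the "fast, local retrieval" the text refers to. A real spider would additionally need HTTP fetching, politeness rules, and HTML parsing, none of which are sketched here.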