Links: information retrieval
Authors: Syed S. Ali, S. McRoy
Journal: Appl. Intell., volume 41 1, pages 17-19
Publication date: 2000-12-01
Publication type: Journal Article
DOI: https://doi.org/10.1145/355137.355141
Citations: 82
Abstract
An information retrieval (IR) system informs the user about the existence and whereabouts of documents or data relating to a query made by the user. Traditional methods for automated information retrieval are largely based on searching and indexing techniques performed by people (such as librarians). Figure 1 illustrates the operation of a generic IR system. In Figure 1, the user enters a query (in this example a Boolean query that asks the IR system to find documents that contain the phrase "information retrieval" as well as the word "resources"). The user query may be processed (for example, to convert the plural "resources" to the singular "resource") and matched against a database of documents that have been preprocessed in order to speed matching. The database can be a local document collection or a collection of networked documents, such as those on the World Wide Web (WWW). The output of the IR system is typically a ranked list of documents. Some IR systems may provide an option for user feedback, such as asking the user to give his opinions on the quality of the matches, and can use this feedback to improve the quality of the search. Increased capabilities of computer hardware and software have created a vast body of machine-readable resources. Typically there is no lack of available information; more often, users, seeking needles in haystacks, are overwhelmed by the quantity of irrelevant information. Often this is caused by a poor query (too vague or too generic; for example, try searching for "computer science"). Even with a well-formulated specific query (such as in Figure 1), results can be poor (for example, Google.com returned as one match a document titled: "Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining"). The popularity of the Web has spurred enormous growth in the number and types of available resources.
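The generic pipeline described above (query processing, matching against a preprocessed document database, ranked output) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the system shown in Figure 1: the three sample documents are hypothetical, `normalize` is a deliberately crude plural-stripping rule standing in for real query processing, the "preprocessed database" is modeled as a positional inverted index, and ranking is a simple count of query-term occurrences.

```python
from collections import defaultdict

def normalize(token):
    """Lowercase a token, trim punctuation, and strip a trailing plural 's'.
    This is a crude illustrative rule, not a real stemmer."""
    token = token.lower().strip('.,;:()"')
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        token = token[:-1]
    return token

def tokenize(text):
    """Split text on whitespace and normalize each token."""
    return [t for t in (normalize(w) for w in text.split()) if t]

def build_index(docs):
    """Preprocess documents into an inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term][doc_id].append(pos)
    return index

def phrase_docs(index, phrase):
    """Return ids of documents in which the phrase's terms occur consecutively."""
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return set()
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits

def search(index, phrase, word):
    """Boolean AND of a phrase and a word, ranked by query-term frequency."""
    matches = phrase_docs(index, phrase) & phrase_docs(index, word)
    terms = set(tokenize(phrase) + tokenize(word))

    def score(doc_id):
        return sum(len(index[t][doc_id]) for t in terms)

    # Highest score first; ties are broken by document id.
    return sorted(matches, key=lambda d: (-score(d), d))

# Hypothetical document collection (not taken from the article).
docs = {
    "d1": "Information retrieval resources for students",
    "d2": "Retrieval of astronomical information and resources",
    "d3": "Information retrieval resource lists: information retrieval on the Web",
}
index = build_index(docs)
results = search(index, "information retrieval", "resources")
```

In this sketch `d2` contains all three query words but not the phrase "information retrieval" as consecutive terms, so the Boolean query excludes it. A purely keyword-based match would not, which is one way loosely related documents can end up in a result list.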
Many networked information retrieval (NIR) tools can be used to search the Web and provide information on demand to unsophisticated end users. Search engines are a simple example; typically they make use of a program (called a spider) that traverses the Web and creates databases of the keywords in a Web page (allowing fast, local retrieval of these resources). IR systems, such as search engines, are most useful when the user makes a precise query, has a clear idea what …
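The spider idea mentioned above (a program that traverses the Web and records the keywords found on each page) can be sketched in Python as follows. This is a toy under stated assumptions, not how any particular search engine works: the `PAGES` dictionary is a hypothetical in-memory stand-in for the Web, `fetch` is any callable that returns a page's text and outgoing links, and the traversal order (breadth-first) is an arbitrary choice.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
PAGES = {
    "http://example.org/a": ("information retrieval basics",
                             ["http://example.org/b", "http://example.org/c"]),
    "http://example.org/b": ("search engines and spiders",
                             ["http://example.org/a"]),
    "http://example.org/c": ("retrieval of networked resources", []),
    "http://example.org/d": ("unreachable page", []),
}

def crawl(seed, fetch):
    """Traverse pages breadth-first from `seed` and build keyword -> set of URLs."""
    keywords = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # page could not be fetched; skip it
            continue
        text, links = page
        for word in text.lower().split():
            keywords.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:  # avoid revisiting pages (handles link cycles)
                seen.add(link)
                queue.append(link)
    return keywords

# Build the local keyword database starting from one seed page.
keyword_db = crawl("http://example.org/a", PAGES.get)
```

Lookups against `keyword_db` are local dictionary accesses, which reflects the point that the crawl happens ahead of time so that retrieval is fast. Pages that no crawled page links to (here `http://example.org/d`) never enter the database.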