On-line searching of Council of Europe conventions and agreements: A study in bilingual document retrieval
N.H. Price, C. Bye, B. Niblett
Information Storage and Retrieval, Volume 10, Issue 3, Pages 145-154 (1974-03-01)
DOI: 10.1016/0020-0271(74)90016-3
Citations: 9
Abstract
At the Second International Conference on Mechanised Information Storage and Retrieval held at Cranfield in 1969, results were presented illustrating the application of a suite of computer programs (the STATUS package) to the searching of the full text of U.K. atomic energy legislation. These programs were at that time implemented on a KDF9 computer at the Culham Laboratory. Since then the programs have been rewritten in modified form for the IBM 370/165 machine at Harwell and used to search (both in batch mode and on-line) the text of material supplied by the Council of Europe. The present paper describes current progress with this work.
The Agreements, Conventions and Protocols concluded between the Member States of the Council of Europe are a largely self-contained set of documents admirably suited for a small-scale full-text retrieval system. They cover a wide range of subject matter including economic, social, cultural, scientific, legal and administrative topics. The best known and the most important of the Conventions is that concerned with the Protection of Human Rights and Fundamental Freedoms. A feature of particular interest is that English and French are the official languages of the organisation and thus the documents are available in both these languages. The total size of the text is some 200,000 words in each language.
For searching purposes the computer stores both the text itself and an inverted file to the text which gives the address (in terms of document, sentence and position in the sentence) of each word. The QUEST search language provides facilities for the enquirer to interrogate the text using logical and positional operators. These operators use the address file to determine which documents satisfy the search criterion formulated by the user. This makes possible a wide variety of searching techniques.
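The address file described above can be illustrated with a minimal Python sketch. This is not the STATUS or QUEST implementation, whose internals and syntax the abstract does not give; the function names (`build_index`, `both`, `near`), the sentence-splitting rule and the sample texts are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each word to its addresses: (document, sentence, position in sentence)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Assumption: sentences end at '.', '!' or '?'.
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        for s_no, sentence in enumerate(sentences):
            for pos, word in enumerate(re.findall(r"[a-z']+", sentence.lower())):
                index[word].append((doc_id, s_no, pos))
    return index

def docs_with(index, word):
    """Set of documents in which the word occurs."""
    return {d for d, _, _ in index.get(word.lower(), [])}

def both(index, a, b):
    """Logical operator: documents containing both words anywhere."""
    return docs_with(index, a) & docs_with(index, b)

def near(index, a, b, max_gap=1):
    """Positional operator: both words in the same sentence, at most max_gap apart."""
    hits = set()
    b_addresses = index.get(b.lower(), [])
    for d, s, p in index.get(a.lower(), []):
        for d2, s2, p2 in b_addresses:
            if d == d2 and s == s2 and abs(p - p2) <= max_gap:
                hits.add(d)
    return hits

# Hypothetical sample texts, not taken from the Council of Europe corpus.
docs = {
    "conv1": "Everyone has the right to liberty. No one shall be held in slavery.",
    "conv2": "The right of asylum is recognised. Liberty of movement is protected.",
}
index = build_index(docs)
print(both(index, "right", "liberty"))             # both documents qualify
print(near(index, "right", "liberty", max_gap=2))  # only conv1 qualifies
```

Both operators are answered from the addresses alone, without rescanning the text, which is the point of storing document, sentence and position for every word.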
The QUEST language includes a special facility which enables an enquirer to formulate at the console, and to store for his use, what are termed “macro” operators and “macro” words. For example, one commonly used “macro” searches text for the definition of a word or phrase; another looks for dates contained in the text. Once a macro is defined it may be used in the formulation of other operators or words. This feature of the language offers the user the opportunity of building up his own library of searching algorithms, as simple or as complex as he wishes, which are personal to him.
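The idea of stored macros that build on one another can be sketched as follows. The sketch does not reproduce QUEST syntax, which the abstract does not show; the registry `macros`, the helpers `define`, `word`, `any_of` and `all_of`, the macro names and the sample sentences are all hypothetical, and the macros here test whole sentences directly rather than consulting an address file.

```python
import re

# Hypothetical personal macro library: name -> predicate over a sentence.
macros = {}

def define(name, predicate):
    """Store a macro under a name so later macros can reuse it."""
    macros[name] = predicate
    return predicate

def word(w, ignore_case=True):
    """Predicate that is true when w occurs as a whole word in the sentence."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(r"\b" + re.escape(w) + r"\b", flags)
    return lambda sentence: bool(pattern.search(sentence))

def any_of(*preds):
    return lambda sentence: any(p(sentence) for p in preds)

def all_of(*preds):
    return lambda sentence: all(p(sentence) for p in preds)

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# A "macro word": any month name (case-sensitive, so the verb "may" is ignored).
define("MONTH", any_of(*(word(m, ignore_case=False) for m in MONTHS)))
# Assumption: a year is a four-digit number from 1800 to 2099.
define("YEAR", lambda s: bool(re.search(r"\b(?:1[89]|20)\d\d\b", s)))
# A macro defined in terms of earlier macros.
define("DATE", all_of(macros["MONTH"], macros["YEAR"]))
# A crude definition finder: sentences using "means" or "shall mean".
define("DEFINES", any_of(word("means"), word("shall mean")))

def search(name, sentences):
    """Return the sentences that satisfy the named macro."""
    return [s for s in sentences if macros[name](s)]

# Hypothetical sample sentences.
sentences = [
    "This Convention was opened for signature on 4 November 1950.",
    "For the purposes of this Agreement, the term 'national' means a citizen.",
    "The Committee shall meet in May.",
]
print(search("DATE", sentences))
print(search("DEFINES", sentences))
```

Because `DATE` is assembled from `MONTH` and `YEAR`, refining either of the smaller macros before `DATE` is defined changes what `DATE` finds, which mirrors the layered library of searching algorithms the paragraph describes.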
The paper describes the main features of the computer programs as implemented on the IBM 370/165 at Harwell, and includes results of typical search enquiries.