On-line searching of Council of Europe conventions and agreements: A study in bilingual document retrieval

N.H. Price, C. Bye, B. Niblett
Information Storage and Retrieval, vol. 10, no. 3, pp. 145-154. Published 1 March 1974. DOI: 10.1016/0020-0271(74)90016-3. Source: https://www.sciencedirect.com/science/article/pii/0020027174900163

Abstract

At the Second International Conference on Mechanised Information Storage and Retrieval held at Cranfield in 1969, results were presented illustrating the application of a suite of computer programs (the STATUS package) to the searching of the full text of U.K. atomic energy legislation. These programs were at that time implemented on a KDF9 computer at the Culham Laboratory. Since then the programs have been rewritten in modified form for the IBM 370/165 machine at Harwell and used to search (both in batch mode and on-line) the text of material supplied by the Council of Europe. The present paper describes current progress with this work.

The Agreements, Conventions and Protocols concluded between the Member States of the Council of Europe are a largely self-contained set of documents admirably suited for a small-scale full-text retrieval system. They cover a wide range of subject matter including economic, social, cultural, scientific, legal and administrative topics. The best known and the most important of the Conventions is that concerned with the Protection of Human Rights and Fundamental Freedoms. A feature of particular interest is that English and French are the official languages of the organisation and thus the documents are available in both these languages. The total size of the text is some 200,000 words in each language.

For searching purposes the computer stores both the text itself and an inverted file to the text which gives the address (in terms of document, sentence and position in the sentence) of each word. The QUEST search language provides facilities for the enquirer to interrogate the text using logical and positional operators. These operators use the address file to determine which documents satisfy the search criterion formulated by the user. This makes possible a wide variety of searching techniques.
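The address scheme described above can be illustrated with a minimal Python sketch. It is not the STATUS or QUEST implementation: the function names (`build_index`, `both`, `within`), the naive sentence splitting, and the two sample documents are all assumptions made up for illustration. It shows only the general idea of an inverted file that records (document, sentence, position) for each word, plus one logical and one positional operator evaluated against those addresses.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to its (doc_id, sentence_no, word_pos) addresses."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        # Naive sentence split on full stops; real legal text needs more care.
        sentences = [s for s in text.split(".") if s.strip()]
        for sent_no, sentence in enumerate(sentences):
            words = re.findall(r"[a-z]+", sentence.lower())
            for pos, word in enumerate(words):
                index[word].append((doc_id, sent_no, pos))
    return index


def docs_with(index, word):
    """Documents containing the word anywhere."""
    return {doc for doc, _, _ in index.get(word.lower(), [])}


def both(index, a, b):
    """Logical operator (hypothetical name): documents containing both words."""
    return docs_with(index, a) & docs_with(index, b)


def within(index, a, b, max_gap):
    """Positional operator (hypothetical name): documents where the two words
    occur in the same sentence, at most max_gap word positions apart."""
    hits = set()
    for doc_a, sent_a, pos_a in index.get(a.lower(), []):
        for doc_b, sent_b, pos_b in index.get(b.lower(), []):
            same_sentence = (doc_a, sent_a) == (doc_b, sent_b)
            if same_sentence and abs(pos_a - pos_b) <= max_gap:
                hits.add(doc_a)
    return hits


# Invented toy documents, not quotations from the Council of Europe texts.
documents = {
    "D1": "Everyone has the right to liberty. No one shall be held in slavery.",
    "D2": "The right of asylum is recognised. Liberty of movement is a separate matter.",
}
index = build_index(documents)

print(both(index, "right", "liberty"))       # documents containing both words
print(within(index, "right", "liberty", 3))  # same sentence, close together
```

In this toy data both documents satisfy the logical query, but only D1 satisfies the positional one, because in D2 the two words fall in different sentences. Both answers come from the address lists alone, without rescanning the text.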

The QUEST language includes a special facility which enables an enquirer to formulate at the console, and to store for his use, what are termed “macro” operators and “macro” words. For example, one commonly used “macro” searches text for the definition of a word or phrase; another looks for dates contained in the text. Once a macro is defined it may be used in the formulation of other operators or words. This feature of the language offers the user the opportunity of building up his own library of searching algorithms, as simple or as complex as he wishes, which are personal to him.
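The macro facility can be pictured as a personal library of named, stored searches that may be built from one another. The Python sketch below is an illustration under stated assumptions, not QUEST syntax: the registry `MACROS`, the macro names `DATE`, `DEFINITION` and `DATED_DEFINITION`, the regular expressions, and the sample text are all hypothetical. It shows a date macro, a definition macro, and a third macro composed from the first two.

```python
import re

MACROS = {}  # hypothetical personal macro library: name -> search function


def define_macro(name, func):
    """Store a search function under a name so later macros can reuse it."""
    MACROS[name] = func
    return func


def sentences(documents):
    """Yield (doc_id, sentence_no, sentence_text) using a naive full-stop split."""
    for doc_id, text in documents.items():
        parts = [p.strip() for p in text.split(".") if p.strip()]
        for sent_no, sentence in enumerate(parts):
            yield doc_id, sent_no, sentence


def date_macro(documents):
    """Sentences containing a four-digit year (a crude stand-in for 'dates')."""
    year = re.compile(r"\b(?:1[89]\d\d|20\d\d)\b")
    return {(d, n) for d, n, s in sentences(documents) if year.search(s)}


def definition_macro(documents, term):
    """Sentences where the term is later followed by the word 'means'."""
    pattern = re.compile(rf"\b{re.escape(term)}\b.*?\bmeans\b", re.IGNORECASE)
    return {(d, n) for d, n, s in sentences(documents) if pattern.search(s)}


define_macro("DATE", date_macro)
define_macro("DEFINITION", definition_macro)

# A macro formulated in terms of previously defined macros.
define_macro(
    "DATED_DEFINITION",
    lambda docs, term: MACROS["DEFINITION"](docs, term) & MACROS["DATE"](docs),
)

# Invented sample text, not taken from the Council of Europe documents.
documents = {
    "D1": "In this Agreement the term vessel means any ship registered after 1950. "
          "Other provisions apply.",
    "D2": "The vessel shall be inspected. The term aircraft means any flying machine.",
}

print(MACROS["DEFINITION"](documents, "vessel"))
print(MACROS["DATED_DEFINITION"](documents, "vessel"))
```

Keeping macros in a dictionary keyed by name mirrors the idea of a stored, user-specific library: a new macro needs only the names of existing ones, not their internals.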

The paper describes the main features of the computer programs as implemented on the IBM 370/165 at Harwell, and includes results of typical search enquiries.
