L. Brown, M. Consens, I. Davis, C. R. Palmer, Frank Wm. Tompa
A Structured Text ADT for Object-Relational Databases
Theory Pract. Object Syst., published 1998-10-12
DOI: 10.1002/(SICI)1096-9942(1998)4:4<227::AID-TAPO3>3.3.CO;2-L
Citations: 18
Abstract
There is a growing need to develop tools that are able to retrieve relevant textual information rapidly, to present textual information in a meaningful way, and to integrate textual information with related data retrieved from other sources. These tools are critical to support applications within corporate intranets and across the rapidly evolving World Wide Web. This paper introduces a framework for modelling structured text and presents a small set of operations that may be applied against such models. Using these operations, structured text may be selected, marked, fragmented, and transformed into relations for use in relational and object-oriented database systems. The extended functionality has been accepted for inclusion within the SQL/MM standard, and a prototype database engine has been implemented to support SQL with the proposed extensions. This prototype serves as a proof of concept intended to address industrial concerns, and it demonstrates the power of the proposed abstract data type for structured text.

1. The challenge

Database technology is essential to the operation of conventional business enterprises, and it is becoming increasingly important in the development of distributed information systems. However, most database systems, and in particular relational database systems, provide few facilities for effectively managing the vast body of electronic information embedded within text.

Many customers require that large texts be searched both vertically, with respect to their internal structure, and horizontally, with respect to their textual content [Wei85]. Texts often need to be fragmented at appropriate structural boundaries. Sometimes selected text needs to be extracted as separate units, but often the appropriate context surrounding selected text must be recovered, and thus the selected text needs to be marked in some manner, so that it can be subsequently located within a potentially much larger context.
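
The four operations named above (select, mark, fragment, transform into relations) can be illustrated with a small Python sketch. It is not the paper's abstract data type and not SQL/MM syntax. The sample document and the function names `select`, `mark`, `fragment`, and `to_relation` are hypothetical, chosen only to mirror the vocabulary of the text, and the structured text is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative only.
DOC = """<report>
<section><title>Overview</title><para>Databases store records.</para></section>
<section><title>Text</title><para>Structured text has markup.</para><para>Search text by content.</para></section>
</report>"""


def select(root, tag, word):
    """Combine a structural ("vertical") and a content ("horizontal") search:
    return every element named `tag` whose text contains `word`."""
    return [el for el in root.iter(tag) if word in "".join(el.itertext())]


def mark(elements, label="hit"):
    """Mark selected elements in place, so that they can be located later
    inside the complete document rather than extracted from it."""
    for el in elements:
        el.set("mark", label)


def fragment(root, tag):
    """Fragment the document at structural boundaries: one subtree per
    element named `tag`, in document order."""
    return list(root.iter(tag))


def to_relation(root, tag):
    """Flatten the fragments into rows of (position, is_marked, text)."""
    return [
        (i, el.get("mark") is not None, "".join(el.itertext()).strip())
        for i, el in enumerate(fragment(root, tag), start=1)
    ]


root = ET.fromstring(DOC)
mark(select(root, "para", "text"))   # mark paragraphs mentioning "text"
rows = to_relation(root, "para")     # one row per paragraph
for row in rows:
    print(row)
```

Because marking leaves the selected paragraphs inside the full tree, the surrounding context (here, the enclosing section) remains recoverable, which is the motivation the text gives for marking rather than extracting.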