More versatile scientific documents
R. Fateman
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997-08-18
DOI: https://doi.org/10.1109/ICDAR.1997.620680
Citations: 9
Abstract
The electronic representation of scientific documents (journals, technical reports, program documentation, laboratory notebooks, etc.) presents challenges in several distinct communities. We see five distinct groups who are concerned with electronic versions of scientific documents: (1) publishers of journals, texts and reference works, and their authors; (2) software publishers for OCR/document analysis and document formatting; (3) software publishers whose products access "contents semantics" from documents, including library keyword search programs, natural language search programs, database systems, visual presentation systems, mathematical computation systems, etc.; (4) institutions maintaining access to electronic libraries, which must be broadly construed to include data and programs of all sorts; and (5) individuals and programs acting as their agents who need to use these libraries to identify, locate and retrieve relevant documents. It would be good to have a convergence in design and standards for encoding new or pre-existing (typically paper-based) documents in order to meet the needs of all these groups. Various efforts, some loosely coordinated, but just as often competing, are trying to set standards and build tools. This paper discusses where we are headed.