Utilizing text mining results: The Pasta Web System
G. Demetriou, R. Gaizauskas
ACL Workshop on Natural Language Processing in the Biomedical Domain, 2002-07-11
DOI: 10.3115/1118149.1118160 (https://doi.org/10.3115/1118149.1118160)
Citation count: 17
Abstract
Information Extraction (IE), the task of extracting structured knowledge from unstructured text sources, offers new opportunities for exploiting the biological information contained in the vast scientific literature. Although IE technology has received increasing attention in molecular biology, few IE systems have been successfully deployed in end-user applications. We describe the development of PASTAWeb, a WWW-based interface to the extraction output of PASTA, an IE system that extracts protein structure information from MEDLINE abstracts. Key characteristics of PASTAWeb are the seamless integration of the PASTA extraction results (templates) with WWW-based technology, the dynamic generation of WWW content from 'static' data, and the fusion of information extracted from multiple documents.
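The abstract does not describe how PASTAWeb fuses templates or generates pages, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. The template fields (`doc_id`, `protein`, `attribute`, `value`), the sample records, and the function names `fuse_templates` and `render_protein_page` are all hypothetical and chosen purely for illustration. It groups facts extracted from several documents under one protein and builds an HTML fragment from that static data on request.

```python
from collections import defaultdict
from html import escape

# Hypothetical templates: one dict per fact extracted from one abstract.
# Field names and values are illustrative, not PASTA's actual schema.
TEMPLATES = [
    {"doc_id": "doc-1", "protein": "Protein A", "attribute": "residue", "value": "SER-195"},
    {"doc_id": "doc-2", "protein": "Protein A", "attribute": "residue", "value": "HIS-57"},
    {"doc_id": "doc-2", "protein": "Protein A", "attribute": "residue", "value": "SER-195"},
    {"doc_id": "doc-3", "protein": "Protein B", "attribute": "secondary_structure", "value": "alpha-helix"},
]


def fuse_templates(templates):
    """Group facts by protein and record which documents support each fact."""
    fused = defaultdict(lambda: defaultdict(set))
    for t in templates:
        fused[t["protein"]][(t["attribute"], t["value"])].add(t["doc_id"])
    return fused


def render_protein_page(protein, facts):
    """Build an HTML fragment for one protein from the fused facts."""
    rows = []
    for (attribute, value), doc_ids in sorted(facts.items()):
        sources = ", ".join(sorted(doc_ids))
        rows.append(
            f"<tr><td>{escape(attribute)}</td>"
            f"<td>{escape(value)}</td>"
            f"<td>{escape(sources)}</td></tr>"
        )
    return f"<h1>{escape(protein)}</h1>\n<table>\n" + "\n".join(rows) + "\n</table>"


fused = fuse_templates(TEMPLATES)
page = render_protein_page("Protein A", fused["Protein A"])
print(page)
```

In this sketch a fact that appears in more than one document is stored once, with every supporting document listed as a source, which is one simple way to present information merged across documents.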