Scalable browsing for large collections: a case study
Authors: G. Paynter, I. Witten, S. Cunningham, G. Buchanan
Journal: Digital Library Perspectives, pp. 215-223 (impact factor 1.1; JCR Q3, Information Science & Library Science)
Published: 2000-06-01
DOI: 10.1145/336597.336666 (https://doi.org/10.1145/336597.336666)
Citations: 65 (source: Semantic Scholar)
Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site. Phrases are extracted from the full text using a novel combination of rudimentary syntactic processing and sequential grammar induction techniques. The interface is simple, robust and easy to use.
To convey a feeling for the quality of the phrases that are generated automatically, a thesaurus used by the organization responsible for the Web site is studied and its degree of overlap with the phrases in the hierarchy is analyzed. Our ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing: the latter provides an authoritative domain vocabulary and the former augments coverage in areas the thesaurus does not reach.
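As a rough illustration of the overlap analysis mentioned above, the following Python sketch compares a set of thesaurus terms with a set of automatically extracted phrases. The abstract does not say how overlap is measured, so the matching rule here (exact match after lowercasing and collapsing whitespace) is an assumption, and the function names and sample terms are hypothetical rather than taken from the paper.

```python
def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


def overlap(thesaurus_terms, extracted_phrases):
    """Return the shared terms and the fraction of thesaurus terms they cover.

    Assumption: two terms "overlap" only if they are identical after
    normalization. The paper may use a different criterion.
    """
    thesaurus = {normalize(t) for t in thesaurus_terms}
    phrases = {normalize(p) for p in extracted_phrases}
    shared = thesaurus & phrases
    coverage = len(shared) / len(thesaurus) if thesaurus else 0.0
    return shared, coverage


# Hypothetical toy data, not taken from the paper.
thesaurus_terms = ["Forest products", "Soil erosion", "Water management", "Fish  farming"]
extracted_phrases = ["forest products", "water management", "rural development", "fish farming"]

shared, coverage = overlap(thesaurus_terms, extracted_phrases)
print(sorted(shared))
print(f"{coverage:.0%} of thesaurus terms also appear as extracted phrases")
```

Under this assumption, extracted phrases that fall outside the shared set (here "rural development") are the ones that would extend coverage beyond the thesaurus, which is the role the abstract assigns to phrase browsing.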
About the journal:
Digital Library Perspectives (DLP) is a peer-reviewed journal concerned with digital content collections. It publishes research on the curation and web-based delivery of digital objects that are collected for the advancement of scholarship, teaching and learning, and that advance the digital information environment as it relates to global knowledge, communication and world memory. The journal aims to keep readers informed about current trends, initiatives, and developments, including those in digital libraries and digital repositories, along with their standards and technologies. The editor invites contributions on the following, as well as other related topics: digitization; data as information; archives and manuscripts; digital preservation and digital archiving; digital cultural memory initiatives; usability studies; and K-12 and higher education uses of digital collections.