Image classification and retrieval on the World Wide Web
Authors: Noureddine Abbadeni, D. Ziou, Shengrui Wang
Journal: Digital Library Perspectives, 22 1, pp. 208-209 (JCR Q3, Information Science & Library Science; impact factor 1.1)
Published: 1999-08-01
DOI: 10.1145/313238.313316 (https://doi.org/10.1145/313238.313316)
Citations: 3
Abstract
Image retrieval is emerging as an important research area with applications in fields such as image and multimedia databases and digital libraries. The World Wide Web is an enormous, distributed, unstructured hypermedia information system that holds tens of millions of images. Tools that make it possible to find specific images in this enormous image database would be of unquestionable utility and would help the WWW reach its full potential. However, searching for images on the WWW is an extremely difficult task and poses new challenges. Two characteristics must be taken into account when dealing with images on the WWW: 1. the enormous size of this collection of images and the extraordinary diversity of image types found on the WWW; 2. in image processing and computer vision, there is no general algorithm able to process all types of images. Two fundamental issues must be addressed when developing retrieval tools: the efficiency of search and the effectiveness of search. Efficiency means that one can find information in a reasonable time. With the power of current workstations and the development of techniques such as parallel and multi-threaded programming, efficiency is not a bottleneck. However, effectiveness, that is, the pertinence of the retrieved images to the request, is a major problem and should be examined more closely. Most retrieval tools existing on the WWW are not effective in this sense: many retrieved documents are not pertinent to the request (noise) and many documents pertinent to the request are not retrieved (silence). Taking all these facts into account, we believe that a preliminary and crucial step before developing an image retrieval tool for the WWW is to classify these images into classes such as photographs, graphics, cartoons, faces, textured images, color images, etc., and then to perform the search within each class.
Doing so brings at least two advantages: 1. effectiveness is improved (noise and silence are reduced), since the search is done in a specific class and not in the whole database; 2. appropriate algorithms can be applied to each class of images, given that no general algorithm for all types of images exists. In this paper, we are interested in performing image search on the WWW using both image content and textual keywords. The following features are already available in our system:
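The noise and silence measures described in the abstract, and the effect of restricting a search to one image class, can be made concrete with a small sketch. It is illustrative only: it is not one of the system's features and not the authors' implementation. It assumes that noise is measured as the fraction of retrieved items that are not pertinent and silence as the fraction of pertinent items that are not retrieved (the abstract defines both only qualitatively). The toy collection, the file names, and the functions `noise_and_silence` and `keyword_search` are all hypothetical.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) for one query.

    Assumed definitions (not given numerically in the abstract):
      noise   = share of retrieved items that are not relevant
      silence = share of relevant items that were not retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    noise = 1 - len(hits) / len(retrieved) if retrieved else 0.0
    silence = 1 - len(hits) / len(relevant) if relevant else 0.0
    return noise, silence


def keyword_search(images, keyword, image_class=None):
    """Keyword match over (name, class, keywords) records.

    If image_class is given, only that class is searched.
    """
    return [
        name
        for name, cls, keywords in images
        if keyword in keywords and (image_class is None or cls == image_class)
    ]


# Hypothetical toy collection: (file name, assigned class, textual keywords)
images = [
    ("tiger1.jpg", "photograph", {"tiger", "jungle"}),
    ("tiger2.gif", "cartoon", {"tiger"}),
    ("logo.gif", "graphic", {"tiger", "logo"}),
    ("cat.jpg", "photograph", {"cat"}),
    ("tiger3.jpg", "photograph", {"tiger"}),
]

# The user wants photographs of tigers.
relevant = {"tiger1.jpg", "tiger3.jpg"}

whole_db = keyword_search(images, "tiger")
one_class = keyword_search(images, "tiger", image_class="photograph")

print(noise_and_silence(whole_db, relevant))   # search over the whole database
print(noise_and_silence(one_class, relevant))  # search within one class
```

In this toy collection, searching the whole database also returns the cartoon and the logo, so half of the results are noise, while the class-restricted search returns only the two photographs. The sketch assumes every image has been assigned the correct class; a misclassified photograph would be missed by the restricted search and would count as silence.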
About the journal:
Digital Library Perspectives (DLP) is a peer-reviewed journal concerned with digital content collections. It publishes research related to the curation and web-based delivery of digital objects collected for the advancement of scholarship, teaching and learning, and which advance the digital information environment as it relates to global knowledge, communication and world memory. The journal aims to keep readers informed about current trends, initiatives, and developments, including those in digital libraries and digital repositories, along with their standards and technologies. The editor invites contributions on the following, as well as other related topics: digitization, data as information, archives and manuscripts, digital preservation and digital archiving, digital cultural memory initiatives, usability studies, and K-12 and higher education uses of digital collections.