Indexing and classification of TV news articles based on telop recognition
Authors: Y. Ariki, T. Teranishi
Published in: Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997-08-18
DOI: 10.1109/ICDAR.1997.619882 (https://doi.org/10.1109/ICDAR.1997.619882)
When accumulating and retrieving multimedia information such as images, speech and text, the information must be compressed and retrieved efficiently and accurately. The purpose of this paper is to construct a multimedia database of TV news images based on telop character recognition. The first step detects telop frames and segments the characters by differentiating the telop frames, relying on the fact that character regions have high brightness and clear edges. The second step is telop character recognition, performed by a subspace method using direction histogram features. The third step is indexing: the recognized telop characters undergo morphological analysis, and the nouns are extracted. These nouns serve as keywords and are assigned to TV news articles as their indices. Finally, the TV news articles are classified into 10 topics, such as politics, economics, culture, amusements and sports, based on the extracted indices. An index-topic table is used to classify the articles from their indices. The telop character recognition rate was 65.7% and the article classification rate was 67.3%.
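
The final classification step can be illustrated with a minimal sketch. The abstract does not say how the index-topic table is built or scored, so the code below assumes a simple voting scheme: each index word maps to weighted topics, and an article receives the topic with the highest summed weight. The table entries, the weights, the tie-breaking rule and the name `classify_article` are all hypothetical, and English words stand in for the Japanese nouns that the original system would extract.

```python
from collections import defaultdict

# Hypothetical index-topic table: index word -> {topic: weight}.
# Entries and weights are invented for illustration; the paper's
# actual table is not given in the abstract.
INDEX_TOPIC_TABLE = {
    "election": {"politics": 1.0},
    "minister": {"politics": 1.0},
    "budget": {"politics": 0.5, "economics": 0.5},
    "stock": {"economics": 1.0},
    "yen": {"economics": 1.0},
    "tournament": {"sports": 1.0},
    "pitcher": {"sports": 1.0},
}


def classify_article(indices, table=INDEX_TOPIC_TABLE):
    """Return the topic with the highest summed weight, or None if no index matches."""
    scores = defaultdict(float)
    for word in indices:
        # Index words missing from the table contribute nothing.
        for topic, weight in table.get(word, {}).items():
            scores[topic] += weight
    if not scores:
        return None
    # Assumed tie-break: alphabetical order, so the result is deterministic.
    return min(scores, key=lambda t: (-scores[t], t))


if __name__ == "__main__":
    print(classify_article(["budget", "stock", "yen"]))
    print(classify_article(["weather"]))
```

Under this assumed scheme, a misrecognized telop character only hurts classification when it corrupts an index word that appears in the table, which is one plausible reason a classifier built on noisy recognition output can still be usable.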