Continuous hierarchical exploration of multimedia collections
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153621
Tomás Grosup, Juraj Mosko, Premysl Cech
Preserving continuity between individual steps of a multimedia exploration process is an intuitive concept that can determine whether a particular exploration system is usable. One way to emulate this continuity is to add some form of granularity to the process, so that users can explore particular areas in more or less detail. In this paper we propose a new concept, hierarchical querying, which binds consecutive steps of the exploration process more tightly together. As a second concept directly supporting the continuity of the exploration process, we propose preserving the user's context between consecutive exploration steps. In addition, we present the evaluation process and architecture design of our multimedia exploration system. To validate our ideas, we have implemented all proposed concepts in a web application that is accessible online.
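As an illustrative reading of the hierarchical querying idea (a sketch under assumptions, not the paper's published algorithm), each exploration step below re-queries only within the subset of the collection selected in the previous step, so consecutive views stay tightly coupled and the user's context is carried forward. All names, the feature representation, and the distance-based ranking are hypothetical.

```python
# Sketch: hierarchical exploration by repeated re-querying within the
# previously selected subset. Assumes precomputed feature vectors; names
# and the plain Euclidean ranking are illustrative, not the authors' method.
import numpy as np

def explore_step(query_vec, features, candidate_ids, k=20):
    """Rank the current candidate subset by distance to the query and keep
    the k closest items; the returned subset is the preserved user context."""
    cand = np.asarray(candidate_ids)
    dists = np.linalg.norm(features[cand] - query_vec, axis=1)
    return cand[np.argsort(dists)[:k]]

# Step 1 explores the whole collection; step 2 drills into the step-1 result,
# keeping consecutive steps tightly related to each other.
# coarse = explore_step(q1, features, np.arange(len(features)))
# fine = explore_step(q2, features, coarse)
```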
{"title":"Continuous hierarchical exploration of multimedia collections","authors":"Tomás Grosup, Juraj Mosko, Premysl Cech","doi":"10.1109/CBMI.2015.7153621","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153621","url":null,"abstract":"Preserving continuity between individual exploration steps in a process of multimedia exploration is a concept of natural intuition that sometimes decides if a particular exploration system is usable or not. One of ways how to emulate the continuity of the exploration process is adding some sort of granularity into this process. Then anyone who uses such system can explore particular areas in less or more details. In this paper we proposed new concept, hierarchical querying, which keeps consecutive steps of the exploration process more tight to each other. As a second concept, which directly supports the continuity of the exploration process, we proposed preservation of a user context between consecutive steps of the exploration process. In addition, we also presented an evaluation process and architecture design of our multimedia exploration system. For a validity confirmation of our ideas, we have implemented all proposed concepts in a web application that is accessible online.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122847139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over-the-shoulder shot detection in art films
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153627
M. Svanera, Sergio Benini, N. Adami, R. Leonardi, A. Kovács
The ability to characterize a film in terms of its narrative and style is becoming a necessity, especially for developing personal video recommendation systems that better deliver on-demand Internet streaming media. Among the identifiable stylistic features that play an important role in a film's emotional effect, the over-the-shoulder (OtS) shot can convey strong dramatic tension to viewers. In this work we propose a methodology to detect this kind of shot automatically by combining, in an SVM learning scheme, state-of-the-art human presence detectors with a set of saliency features based on colour and motion. In the experimental investigation, comparing the obtained results with manual annotations made by cinema experts confirms the validity of the framework. Experiments are conducted on two art films directed by Michelangelo Antonioni, belonging to his famous “tetralogy on modernity and its discontent”: one in shades of gray (L'avventura, 1960) and the other in colour (Il deserto rosso, 1964).
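A minimal sketch of the learning scheme outlined above, assuming per-shot outputs of human-presence detectors (person and face scores here) and colour/motion saliency statistics are already available; the specific cues and the SVM hyperparameters are placeholders, not the authors' exact configuration.

```python
# Sketch: combine presence-detector outputs with saliency statistics in an
# SVM, as the abstract describes. All feature names are hypothetical.
import numpy as np
from sklearn.svm import SVC

def shot_feature(person_scores, face_scores, colour_sal, motion_sal):
    """Aggregate frame-level cues of one shot into a fixed-length vector."""
    return np.array([
        np.max(person_scores), np.mean(person_scores),  # human presence cues
        np.max(face_scores), np.mean(face_scores),
        np.mean(colour_sal), np.std(colour_sal),        # colour saliency cues
        np.mean(motion_sal), np.std(motion_sal),        # motion saliency cues
    ])

# X: one row per expert-annotated shot, y: 1 = over-the-shoulder, 0 = other
# clf = SVC(kernel="rbf", C=1.0).fit(X, y)
# is_ots = clf.predict(X_test)
```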
{"title":"Over-the-shoulder shot detection in art films","authors":"M. Svanera, Sergio Benini, N. Adami, R. Leonardi, A. Kovács","doi":"10.1109/CBMI.2015.7153627","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153627","url":null,"abstract":"The ability to characterize a film, in terms of its narrative and style, is becoming a necessity especially for developing personal video recommendation systems to better deliver on-demand Internet streaming media. Among the set of identifiable stylistic features which play an important role in the film's emotional effects, the use of Over-the-shoulder (OtS) shots in movies is able to convey a big dramatic tension on the viewers. In this work we propose a methodology able to automatically detect this kind of shots by combining in a SVM learning scheme some state-of-the-art human presence detectors, with a set of saliency features based on colour and motion. In the experimental investigation, the comparison of obtained results with manual annotations made by cinema experts proves the validity of the framework. Experiments are conducted on two art films directed by Michelangelo Antonioni belonging to his famous “tetralogy on modernity and its discontent”, one in shades of gray (L'avventura, 1960), and the other in colour motion (Il deserto rosso, 1964).","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116978870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A GPU-accelerated two stage visual matching pipeline for image and video retrieval
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153620
Hannes Fassold, H. Stiegler, Jakub Rosner, M. Thaler, W. Bailer
We propose a two-stage visual matching pipeline: a first step that filters results using VLAD signatures, and a second step that reranks the top results using raw matching of SIFT descriptors. This makes it possible to adjust the tradeoff between the high computational cost of matching local descriptors and the insufficient accuracy of compact signatures in many application scenarios. We describe GPU-accelerated extraction and matching algorithms for SIFT, which yield a speedup factor of at least 4. The VLAD filtering step reduces the number of images/frames for which local descriptors need to be matched, speeding up retrieval by an additional factor of 9-10 without sacrificing mean average precision compared to full raw descriptor matching.
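A minimal CPU sketch of the two-stage idea (filter with VLAD, rerank with raw SIFT), assuming signatures and descriptors are precomputed; it omits the paper's GPU acceleration, and all names are illustrative.

```python
# Sketch: stage 1 shortlists by VLAD distance, stage 2 reranks the shortlist
# by raw SIFT matching with Lowe's ratio test. NumPy only; not the authors'
# GPU implementation.
import numpy as np

def vlad_shortlist(query_vlad, db_vlads, k=100):
    """Stage 1: keep the k nearest VLAD signatures (L2 distance)."""
    dists = np.linalg.norm(db_vlads - query_vlad, axis=1)
    return np.argsort(dists)[:k]

def sift_match_score(query_desc, cand_desc, ratio=0.8):
    """Stage 2: count ratio-test correspondences between descriptor sets
    (assumes at least two candidate descriptors)."""
    d = np.linalg.norm(query_desc[:, None, :] - cand_desc[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    rows = np.arange(d.shape[0])
    best, second = d[rows, order[:, 0]], d[rows, order[:, 1]]
    return int(np.sum(best < ratio * second))

def two_stage_retrieval(query_vlad, query_desc, db_vlads, db_descs, k=100):
    shortlist = vlad_shortlist(query_vlad, db_vlads, k)
    scored = [(i, sift_match_score(query_desc, db_descs[i])) for i in shortlist]
    return sorted(scored, key=lambda s: -s[1])  # rerank shortlist by matches
```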
{"title":"A GPU-accelerated two stage visual matching pipeline for image and video retrieval","authors":"Hannes Fassold, H. Stiegler, Jakub Rosner, M. Thaler, W. Bailer","doi":"10.1109/CBMI.2015.7153620","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153620","url":null,"abstract":"We propose a two stage visual matching pipeline including a first step using VLAD signatures for filtering results, and a second step which reranks the top results using raw matching of SIFT descriptors. This enables adjusting the tradeoff between high computational cost of matching local descriptors and the insufficient accuracy of compact signatures in many application scenarios. We describe GPU accelerated extraction and matching algorithms for SIFT, which result in a speedup factor of at least 4. The VLAD filtering step reduces the number of images/frames for which the local descriptors need to be matched, thus speeding up retrieval by an additional factor of 9-10 without sacrificing mean average precision over full raw descriptor matching.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132830175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Mex-Culture Multimedia platform: Preservation and dissemination of the Mexican Culture
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153624
A. Ramirez, J. Benois-Pineau, M. García-Vázquez, A. Stoian, M. Crucianu, M. Nakano-Miyatake, F. Garcia-Ugalde, Jean-Luc Rouas, H. Nicolas, J. Carrive
In this paper we present the Mex-Culture Multimedia platform, the first prototype of multimedia indexing and retrieval for large-scale access to digitized Mexican cultural audio-visual content. The platform is designed as an open and extensible architecture of Web services. The different architectural layers and media services are presented, supporting a rich set of scenarios: summarization of audio-visual content in cross-media description spaces, video queries by actions, key-frame and image queries by example, and audio-analysis services. Specific attention is paid to selecting data representative of Mexican cultural content. Scalability issues are addressed as well.
{"title":"The Mex-Culture Multimedia platform: Preservation and dissemination of the Mexican Culture","authors":"A. Ramirez, J. Benois-Pineau, M. García-Vázquez, A. Stoian, M. Crucianu, M. Nakano-Miyatake, F. Garcia-Ugalde, Jean-Luc Rouas, H. Nicolas, J. Carrive","doi":"10.1109/CBMI.2015.7153624","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153624","url":null,"abstract":"In this paper we present the Mex-Culture Multimedia platform, which is the first prototype of multimedia indexing and retrieval for a large-scale access to digitized Mexican cultural audio-visual content. The platform is designed as an open and extensible architecture of Web services. The different architectural layers and media services are presented, ensuring a rich set of scenarios. The latter comprises summarization of audio-visual content in cross-media description spaces, video queries by actions, key-frame and image queries by example and audio-analysis services. Specific attention is paid to the selection of data to be representative of Mexican cultural content. Scalability issues are addressed as well.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128435467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection of ulcerative colitis severity in colonoscopy video frames
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153617
Ashok Dahal, Jung-Hwan Oh, Wallapak Tavanapong, J. Wong, P. C. Groen
Ulcerative colitis (UC) is a chronic inflammatory disease characterized by periods of relapse and remission, affecting more than 500,000 people in the United States. The therapeutic goals for UC are to first induce and then maintain disease remission. However, it is very difficult to evaluate the severity of UC objectively because of the non-uniform nature of the symptoms associated with UC and the large variations in their patterns. To address this, we objectively measure and classify the severity of UC shown in optical colonoscopy video frames based on image textures. To extract distinctive textures, we use a hybrid approach in which a newly proposed feature based on the accumulation of pixel value differences is combined with an existing feature, LBP (Local Binary Pattern). The experimental results show that the hybrid method achieves more than 90% overall accuracy.
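A minimal sketch of one plausible reading of the hybrid feature: a standard 3x3 LBP histogram concatenated with a histogram of accumulated differences between adjacent pixels. The exact pixel-difference feature in the paper may differ; this is illustrative only.

```python
# Sketch: hybrid texture feature = LBP histogram + histogram of accumulated
# pixel-value differences, both normalised and concatenated. An illustrative
# reading of the abstract, not the authors' exact feature definition.
import numpy as np

def lbp_histogram(gray):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel."""
    h, w = gray.shape
    centre = gray[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(centre, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neigh = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neigh >= centre).astype(np.uint8) << bit
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()

def pixel_diff_histogram(gray, bins=32):
    """Accumulate absolute differences between adjacent pixels (horizontal
    and vertical) and summarise them as a normalised histogram."""
    g = gray.astype(np.int16)
    diffs = np.concatenate([np.abs(np.diff(g, axis=0)).ravel(),
                            np.abs(np.diff(g, axis=1)).ravel()])
    hist, _ = np.histogram(diffs, bins=bins, range=(0, 256))
    return hist / hist.sum()

def hybrid_feature(gray):
    """Concatenated descriptor fed to a standard classifier (e.g., an SVM)."""
    return np.concatenate([lbp_histogram(gray), pixel_diff_histogram(gray)])
```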
{"title":"Detection of ulcerative colitis severity in colonoscopy video frames","authors":"Ashok Dahal, Jung-Hwan Oh, Wallapak Tavanapong, J. Wong, P. C. Groen","doi":"10.1109/CBMI.2015.7153617","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153617","url":null,"abstract":"Ulcerative colitis (UC) is a chronic inflammatory disease characterized by periods of relapses and remissions affecting more than 500,000 people in the United States. The therapeutic goals of UC are to first induce and then maintain disease remission. However, it is very difficult to evaluate the severity of UC objectively because of non-uniform nature of symptoms associated with UC, and large variations in their patterns. To address this, we objectively measure and classify the severity of UC presented in optical colonoscopy video frames based on the image textures. To extract distinct textures, we are using a hybrid approach in which a new proposed feature based on the accumulation of pixel value differences is combined with an existing feature such as LBP (Local Binary Pattern). The experimental results show the hybrid method can achieve more than 90% overall accuracy.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133073961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VSD2014: A dataset for violent scenes detection in Hollywood movies and web videos
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153604
M. Schedl, Mats Sjöberg, Ionut Mironica, B. Ionescu, Vu Lam Quang, Yu-Gang Jiang, C. Demarty
In this paper, we introduce VSD2014, a dataset for detecting violent scenes and violence-related concepts. It contains annotations as well as auditory and visual features of Hollywood movies and user-generated footage shared on the web. The dataset is the result of a joint annotation endeavor by different research institutions and responds to the real-world use case of parental guidance in selecting appropriate content for children. The dataset has been validated in the Violent Scenes Detection (VSD) task at the MediaEval benchmarking initiative for multimedia evaluation.
{"title":"VSD2014: A dataset for violent scenes detection in hollywood movies and web videos","authors":"M. Schedl, Mats Sjöberg, Ionut Mironica, B. Ionescu, Vu Lam Quang, Yu-Gang Jiang, C. Demarty","doi":"10.1109/CBMI.2015.7153604","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153604","url":null,"abstract":"In this paper, we introduce a violent scenes and violence-related concept detection dataset named VSD2014. It contains annotations as well as auditory and visual features of Hollywood movies and user-generated footage shared on the web. The dataset is the result of a joint annotation endeavor of different research institutions and responds to the real-world use case of parental guidance in selecting appropriate content for children. The dataset has been validated during the Violent Scenes Detection (VSD) task at the MediaEval benchmarking initiative for multimedia evaluation.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122640738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Event video retrieval using global and local descriptors in visual domain
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153622
Jennifer Roldan-Carlos, M. Lux, Xavier Giró-i-Nieto, P. Muñoz, N. Anagnostopoulos
With the advent of affordable multimedia smartphones, it has become common for people to take videos at events. The larger the event, the more videos are taken there, and the more videos are shared online. Searching this mass of videos is a challenging topic. In this paper we present and discuss prototype software for searching such videos. We focus only on visual information, and we report on experiments based on a research data set. With a small study we show that our prototype achieves promising results, identifying the same scene in different videos taken from different angles solely based on content-based image retrieval.
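A minimal sketch of same-scene linking via content-based image retrieval, assuming keyframes are already extracted; the global colour-histogram descriptor and the match threshold are stand-ins, not the paper's actual global and local descriptors.

```python
# Sketch: describe keyframes with a global colour histogram and link two
# videos when any keyframe pair is close enough. Descriptor and threshold
# are illustrative assumptions.
import numpy as np

def colour_histogram(rgb_frame, bins=8):
    """Global descriptor: joint RGB histogram of an HxWx3 frame, L1-normalised."""
    hist, _ = np.histogramdd(rgb_frame.reshape(-1, 3),
                             bins=(bins,) * 3, range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def same_scene(keyframes_a, keyframes_b, threshold=0.25):
    """Two videos are linked if any pair of keyframes matches (L1 distance)."""
    hists_a = [colour_histogram(f) for f in keyframes_a]
    hists_b = [colour_histogram(f) for f in keyframes_b]
    return any(np.abs(ha - hb).sum() < threshold
               for ha in hists_a for hb in hists_b)
```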
{"title":"Event video retrieval using global and local descriptors in visual domain","authors":"Jennifer Roldan-Carlos, M. Lux, Xavier Giró-i-Nieto, P. Muñoz, N. Anagnostopoulos","doi":"10.1109/CBMI.2015.7153622","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153622","url":null,"abstract":"With the advent of affordable multimedia smart phones, it has become common that people take videos when they are at events. The larger the event, the larger is the amount of videos taken there and also, the more videos get shared online. To search in this mass of videos is a challenging topic. In this paper we present and discuss a prototype software for searching in such videos. We focus only on visual information, and we report on experiments based on a research data set. With a small study we show that our prototype demonstrates promising results by identifying the same scene in different videos taken from different angles solely based on content based image retrieval.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127902819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Instrument classification in laparoscopic videos
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153616
Manfred Jürgen Primus, Klaus Schöffmann, L. Böszörményi
In medical endoscopy, more and more surgeons record videos of their interventions in a long-term storage archive for later retrieval. To allow content-based search in such endoscopic video archives, the video data first needs to be indexed. However, even the very basic step of content-based indexing, namely content segmentation, is already very challenging due to the special characteristics of such video data. Therefore, we propose to use instrument classification to enable semantic segmentation of laparoscopic videos. In this paper, we evaluate the performance of such an instrument classification approach. Our results show satisfactory performance for all instruments used in our evaluation.
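To make the segmentation-by-classification idea concrete, the sketch below assumes an instrument classifier already yields one label per frame and merges runs of equal labels into segments; the labels and the merging rule are illustrative, not the paper's pipeline.

```python
# Sketch: derive a semantic segmentation from per-frame instrument labels by
# merging consecutive frames with the same predicted label. The classifier
# itself is abstracted away; labels are assumed given.
def segments_from_labels(frame_labels):
    """Turn a per-frame label sequence into (start, end, label) segments."""
    segments, start = [], 0
    for i in range(1, len(frame_labels) + 1):
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((start, i - 1, frame_labels[start]))
            start = i
    return segments

# segments_from_labels(["grasper", "grasper", "scissors", "scissors"])
# -> [(0, 1, "grasper"), (2, 3, "scissors")]
```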
{"title":"Instrument classification in laparoscopic videos","authors":"Manfred Jürgen Primus, Klaus Schöffmann, L. Böszörményi","doi":"10.1109/CBMI.2015.7153616","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153616","url":null,"abstract":"In medical endoscopy more and more surgeons record videos of their interventions in a long-term storage archive for later retrieval. In order to allow content-based search in such endoscopic video archives, the video data needs to be indexed first. However, even the very basic step of content-based indexing, namely content segmentation, is already very challenging due to the special characteristics of such video data. Therefore, we propose to use instrument classification to enable semantic segmentation of laparoscopic videos. In this paper, we evaluate the performance of such an instrument classification approach. Our results show satisfying performance for all instruments used in our evaluation.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"88 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124374457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An unsupervised approach for comparing styles of illustrations
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153615
T. Furuya, Shigeru Kuriyama, Ryutarou Ohbuchi
In creating web pages, books, or presentation slides, consistent use of tasteful visual styles is quite important. In this paper, we consider the problem of style-based comparison and retrieval of illustrations. In their pioneering work, Garces et al. [2] proposed an algorithm for comparing illustrative style. That algorithm uses supervised learning that relies on stylistic labels present in a training dataset; in reality, obtaining such labels is quite difficult. In this paper, we propose an unsupervised approach to accurate and efficient stylistic comparison among illustrations. The proposed algorithm combines heterogeneous local visual features extracted densely. These features are aggregated into a feature vector per illustration before being processed by distance metric learning based on unsupervised dimension reduction for saliency and compactness. Experimental evaluation of the proposed method using multiple benchmark datasets indicates that it outperforms existing approaches.
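A minimal sketch of the unsupervised pipeline's shape: densely extracted local features are mean-pooled into one vector per illustration and projected by an unsupervised dimension reduction (plain PCA here, as a generic stand-in for the paper's metric-learning step).

```python
# Sketch: aggregate dense local features per illustration, then compare
# styles in an unsupervised low-dimensional space. PCA stands in for the
# paper's dimension-reduction-based metric learning.
import numpy as np
from sklearn.decomposition import PCA

def aggregate(local_features):
    """Mean-pool an (n_descriptors, dim) array into one vector per image."""
    return np.mean(local_features, axis=0)

def style_distances(per_image_features, n_components=64):
    """Pairwise style distances in the reduced space (n_components must not
    exceed the number of illustrations or the feature dimension)."""
    X = np.stack(per_image_features)
    Z = PCA(n_components=n_components).fit_transform(X)
    return np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
```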
{"title":"An unsupervised approach for comparing styles of illustrations","authors":"T. Furuya, Shigeru Kuriyama, Ryutarou Ohbuchi","doi":"10.1109/CBMI.2015.7153615","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153615","url":null,"abstract":"In creating web pages, books, or presentation slides, consistent use of tasteful visual style(s) is quite important. In this paper, we consider the problem of style-based comparison and retrieval of illustrations. In their pioneering work, Garces et al. [2] proposed an algorithm for comparing illustrative style. The algorithm uses supervised learning that relied on stylistic labels present in a training dataset. In reality, obtaining such labels is quite difficult. In this paper, we propose an unsupervised approach to achieve accurate and efficient stylistic comparison among illustrations. The proposed algorithm combines heterogeneous local visual features extracted densely. These features are aggregated into a feature vector per illustration prior to be treated with distance metric learning based on unsupervised dimension reduction for saliency and compactness. Experimental evaluation of the proposed method by using multiple benchmark datasets indicates that the proposed method outperforms existing approaches.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126436927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparison of metric space browsing strategies for efficient image exploration
Pub Date: 2015-06-10, DOI: 10.1109/CBMI.2015.7153631
Premysl Cech, Tomás Grosup
In this paper, we compare eight different multimedia exploration methods. We describe each of them individually and evaluate their effectiveness in a user study focusing on different aspects of image exploration needs. We also created a testing scenario for the user study and defined several metrics to compare the exploration methods.
{"title":"Comparison of metric space browsing strategies for efficient image exploration","authors":"Premysl Cech, Tomás Grosup","doi":"10.1109/CBMI.2015.7153631","DOIUrl":"https://doi.org/10.1109/CBMI.2015.7153631","url":null,"abstract":"In this paper, we compare eight different multimedia exploration methods. We describe each of them individually and evaluate their effectiveness in a user study focusing on different aspects of image exploration needs. We also created a testing scenario for the user study and defined several metrics to compare the exploration methods.","PeriodicalId":387496,"journal":{"name":"2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130510792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}