Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665118
A. Naman, D. Taubman
A JPEG2000 compressed video sequence can provide better support for scalability, flexibility, and accessibility over a wider range of bit-rates than the current motion-compensated predictive video coding standards; however, it requires considerably more bandwidth to stream. The authors have recently proposed a novel approach that reduces the required bandwidth; this approach uses motion compensation and conditional replenishment of JPEG2000 code-blocks, aided by server-optimized selection of these code-blocks. The proposed approach can serve a diverse range of client requirements and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. This work extends the previous work by approximating the distortion associated with the decisions made by the server, without the need to recreate the actual video sequence at the server. The proposed distortion estimation algorithm is general and can be applied to various frame arrangements. Here, we choose to employ it in a hierarchical arrangement of frames, similar to the hierarchical B-frames of the scalable video coding (SVC) extension of the H.264/AVC standard. We apply a Lagrangian-style rate-distortion optimization procedure to the server transmission problem and compare the performance of both the distortion-estimation and exact-distortion-calculation cases against streaming of individual frames and against SVC. The results obtained suggest that the distortion estimation algorithm considerably reduces the amount of computation needed by the server without substantially degrading performance compared to the exact distortion calculation. This work introduces the concepts, formulates the estimation and optimization problems, proposes a solution, and compares its performance to alternative strategies.
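The server-side selection described above can be pictured with the Lagrangian sketch below. It assumes hypothetical per-code-block increments, each with an estimated distortion reduction `delta_D` and a rate cost `delta_R`; the names, numbers, and bisection loop are illustrative only and do not reproduce the authors' actual optimization or distortion-estimation algorithm.

```python
# Illustrative Lagrangian selection of JPEG2000 code-block increments.
# Each candidate carries a (hypothetical) estimated distortion reduction
# delta_D and a rate cost delta_R in bytes.

def select_increments(candidates, lam):
    """Keep every increment whose Lagrangian gain delta_D - lam * delta_R is positive."""
    return [c for c in candidates if c["delta_D"] - lam * c["delta_R"] > 0]

def fit_to_budget(candidates, rate_budget, lo=0.0, hi=1e6, iters=50):
    """Bisect the Lagrange multiplier until the selected set fits the rate budget."""
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        used = sum(c["delta_R"] for c in select_increments(candidates, lam))
        if used > rate_budget:
            lo = lam          # over budget: penalize rate more
        else:
            hi = lam          # within budget: try to admit more increments
    return select_increments(candidates, hi)

# Fabricated example: three code-block increments and a 600-byte budget.
blocks = [{"id": 0, "delta_D": 900.0, "delta_R": 300},
          {"id": 1, "delta_D": 400.0, "delta_R": 250},
          {"id": 2, "delta_D": 120.0, "delta_R": 400}]
print([c["id"] for c in fit_to_budget(blocks, rate_budget=600)])   # -> [0, 1]
```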
{"title":"Distortion estimation for optimized delivery of JPEG2000 compressed video with motion","authors":"A. Naman, D. Taubman","doi":"10.1109/MMSP.2008.4665118","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665118","url":null,"abstract":"A JPEG2000 compressed video sequence can provide better support for scalability, flexibility, and accessibility at a wider range of bit-rates than the current motion-compensated predictive video coding standards; however, it requires considerably more bandwidth to stream. The authors have recently proposed a novel approach that reduces the required bandwidth; this approach uses motion compensation and conditional replenishment of JPEG2000 code-blocks, aided by server-optimized selection of these code-blocks. The proposed approach can serve a diverse range of client requirements and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. This work extends the previous work by approximating the distortion associated with the decisions made by the server without the need to recreate the actual video sequence at the server. The proposed distortion estimation algorithm is general and can be applied to various frames arrangements. Here, we choose to employ it in a hierarchical arrangement of frames, similar to the hierarchical B-frames of the SVC scalable video coding extension of the H.264/AVC standard. We employ a Lagrangian-style rate-distortion optimization procedure to the server transmission problem and compare the performance of both distortion estimation and exact distortion calculation cases against streaming individual frames and SVC. Results obtained suggest that the distortion estimation algorithm considerably reduces the amount of calculation needed by the server without enormously degrading the performance compared to the exact distortion calculation case. This work introduces the concepts, formulates the estimation and optimization problems, proposes a solution, and compares the performance to alternate strategies.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"18 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130920332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665183
Hao-Tian Wu, J. Dugelay
In this paper, a reversible watermarking algorithm based on prediction-error expansion is proposed for 3D mesh models. First, we predict a vertex position by calculating the centroid of its traversed neighbors. Then the prediction error, i.e., the difference between the predicted and actual positions, is expanded for data embedding. Thus, only the vertex coordinates are modified to embed a watermark into the mesh content, without changing the topology. We further reduce the distortion by adaptively choosing a threshold so that prediction errors with excessively large magnitudes are not expanded. The chosen threshold value and critical location information are saved in the watermarked mesh to guide the recovery process. Experiments show that the original mesh can be exactly recovered, so our algorithm can be used for symmetric- or public-key authentication of 3D mesh models.
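For readers unfamiliar with prediction-error expansion, the sketch below shows the basic embed/recover arithmetic on a single integer vertex coordinate, using the neighbour-centroid prediction the abstract mentions. It is a minimal illustration of the generic technique and omits the paper's threshold and location-information bookkeeping.

```python
# Minimal prediction-error-expansion sketch for one integer vertex coordinate.
# The centroid prediction and bit arithmetic follow the generic technique;
# the adaptive threshold and overflow handling described in the paper are omitted.

def embed(coord, neighbor_coords, bit):
    pred = round(sum(neighbor_coords) / len(neighbor_coords))  # centroid of traversed neighbours
    err = coord - pred                                         # prediction error
    expanded = 2 * err + bit                                   # expand the error, hide one bit in the LSB
    return pred + expanded                                     # watermarked coordinate

def recover(marked_coord, neighbor_coords):
    pred = round(sum(neighbor_coords) / len(neighbor_coords))
    expanded = marked_coord - pred
    bit = expanded & 1                                         # extract the hidden bit
    err = expanded >> 1                                        # undo the expansion
    return pred + err, bit                                     # original coordinate, recovered bit

marked = embed(105, [100, 102, 101], bit=1)
print(recover(marked, [100, 102, 101]))   # -> (105, 1)
```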
{"title":"Reversible watermarking of 3D mesh models by prediction-error expansion","authors":"Hao-Tian Wu, J. Dugelay","doi":"10.1109/MMSP.2008.4665183","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665183","url":null,"abstract":"In this paper, a reversible watermarking algorithm is proposed for 3D mesh models based on prediction-error expansion. Firstly, we predict a vertex position by calculating the centroid of its traversed neighbors. Then the prediction error, i.e. the difference between the predicted and real positions, is expanded for data embedding. So only the vertex coordinates are modified to embed a watermark into the mesh content without changing the topology. We further reduce the distortion by adaptively choosing a threshold so that the prediction errors with too large magnitude will not be expanded. The chosen threshold value and critical location information should be saved in the watermarked mesh to guide the recovery process. The experiments show that the original mesh can be exactly recovered and consequently our algorithm can be used for symmetric or public key authentication of 3D mesh models.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134381887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665207
S. Mehrotra, Weig-Ge Chen, K. Koishida, Naveen Thumpudi
Audio coding at low bitrates typically suffers from artifacts caused by bandwidth truncation. In this paper we present a novel scheme for coding audio signals at low bitrates that uses traditional scalar quantization followed by entropy coding for some portions of the spectrum (typically the lower portion). The other portions (typically the higher portions) of the spectrum are coded at a low bitrate using an adaptive gain-shape vector quantizer, where the codebook for vector quantization is formed from unmodified or modified versions of the portions of the spectrum that have already been coded. Fixed pre-trained codebooks are also available for use in certain cases. This scheme results in an audio codec that has been shown to be among the best available at low bitrates. In addition, the decoder complexity of this codec is significantly lower than that of other codecs of comparable quality at low bitrates.
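A toy sketch of gain-shape vector quantization for one spectral band is given below, using a made-up random codebook. The paper's key idea, building the codebook adaptively from already-coded spectrum, is not reproduced here; this only illustrates the gain/shape split itself.

```python
import numpy as np

def gain_shape_quantize(band, shape_codebook, gain_levels):
    """Quantize a spectral band as a (gain index, shape index) pair."""
    gain = np.linalg.norm(band)
    if gain == 0:
        return 0, 0
    shape = band / gain
    # Best shape = codeword with maximum correlation to the unit-norm band.
    shape_idx = int(np.argmax(shape_codebook @ shape))
    # Scalar-quantize the gain to the nearest allowed level.
    gain_idx = int(np.argmin(np.abs(gain_levels - gain)))
    return gain_idx, shape_idx

def gain_shape_dequantize(gain_idx, shape_idx, shape_codebook, gain_levels):
    return gain_levels[gain_idx] * shape_codebook[shape_idx]

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 8))
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)   # unit-norm shape codewords
gains = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])              # illustrative gain levels
band = rng.standard_normal(8)                                 # stand-in for a high-band of MDCT coefficients
g, s = gain_shape_quantize(band, codebook, gains)
print(gain_shape_dequantize(g, s, codebook, gains))
```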
{"title":"Hybrid low bitrate audio coding using adaptive gain shape vector quantization","authors":"S. Mehrotra, Weig-Ge Chen, K. Koishida, Naveen Thumpudi","doi":"10.1109/MMSP.2008.4665207","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665207","url":null,"abstract":"Audio coding at low bitrates typically suffers from artifacts caused by bandwidth truncation. In this paper we present a novel scheme to code audio signals at low bitrates which uses a traditional scalar quantization followed by entropy coding to code some portions of the spectrum (typically the lower portion). The other portions (typically the higher portions) of the spectrum are coded at a low bitrate using an adaptive gain shape vector quantizer where the codebook for vector quantization is formed by unmodified or modified versions of the portions of the spectrum which have already been coded. Fixed pre-trained codebooks are also available for use in certain cases. The use of such a scheme results in an audio codec which has been shown to be among the best audio codecs available at low bitrates. In addition, the decoder complexity of this audio codec is significantly lower than any other codec of equal quality at low bitrates.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129462774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665053
I. Feldmann, W. Waizenegger, O. Schreer
In this paper we discuss the application of 3D scene reconstruction techniques to automatic semantic annotation, search, and retrieval of unedited video footage. Rather than working with static key-frames, we exploit the time-dependent dynamic properties of a moving camera. Based on state-of-the-art camera self-calibration techniques, we develop a powerful analysis chain. We demonstrate that the reconstructed 3D scene information can be used to generate both accurate low-level scene descriptors and meaningful medium- and high-level semantic information. We show that the proposed algorithms work even for sparse data sets. The proposed algorithms provide a solid basis for further investigations into low-, medium-, and high-level extraction of semantic information from unedited video.
{"title":"Extraction of 3D scene structure for semantic annotation and retrieval of unedited video","authors":"I. Feldmann, W. Waizenegger, O. Schreer","doi":"10.1109/MMSP.2008.4665053","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665053","url":null,"abstract":"In this paper we discuss the application of 3D scene reconstruction techniques in the area of automatic semantic annotation, search and retrieval of unedited video footage. Rather than working with static key-frames we exploit the time-depended dynamic properties of a moving camera. Based on state of the art camera self calibration techniques we develop a powerful analysis chain. We demonstrate, that the reconstructed 3D scene information can be used to generate both, accurate low level scene descriptors as well as meaningful medium and high level semantic information. We show, that the proposed algorithms work even in case of sparse data sets. The proposed algorithms provide a powerful working base for further investigations in the area of low, medium and high level extraction of semantic information for unedited video.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129836417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665093
Moyuresh Biswas, M. Frater, J. Arnold, M. Pickering
The problem of resilient video transmission over lossy packet networks is addressed in this paper. We propose a rate-distortion optimized multiple description (MD) codec. Two different optimization controls for the codec are described, suited to different rates of packet loss, including the case where packets can travel over multiple paths through the network, each with its own path-dependent packet-loss probability. A packetization method designed to work seamlessly with the proposed MD codec is also presented. Simulations performed under various packet-loss scenarios show the importance of the two optimizations and demonstrate that the proposed framework achieves significantly improved video quality compared with similar techniques.
{"title":"An optimized Multiple Description video codec for lossy packet networks","authors":"Moyuresh Biswas, M. Frater, J. Arnold, M. Pickering","doi":"10.1109/MMSP.2008.4665093","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665093","url":null,"abstract":"The problem of resilient video transmission over lossy packet networks is addressed in this paper. We propose a rate-distortion optimized multiple description (MD) codec. Two different optimization controls of the codec are described that are suited to rates of packet loss, including the case where packets can travel over multiple paths through the network, with each path-dependent packet-loss probabilities. A packetization method optimized to work seamlessly with the proposed MD codec is also proposed. Simulations performed under various packet loss scenarios show the importance of the two optimizations and also that the proposed framework achieves significantly improved video quality when compared with similar techniques.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133474625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665148
Xiaole Ding, Yin-Jun Miao, Fan Bu, Lifeng Sun, Shiqiang Yang
Highlight detection is a challenging task in soccer video analysis. Using Web-casting text as external knowledge has proven to be a shortcut to achieving both efficiency and effectiveness. Building on the previous framework that uses Web-casting text, we have improved the video time detection and highlight boundary detection processes. Our method can detect a transparent time bar and achieves acceptable precision in highlight boundary detection even though the timestamps in the Web text are inaccurate. These improvements make the framework more robust in practice.
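The alignment step underlying this kind of framework can be pictured with the toy calculation below: once an on-screen game-clock reading has been recognized in one frame, a Web-casting event time is mapped to a frame index and a coarse candidate highlight window. The fixed window sizes and all numbers are illustrative assumptions, not the paper's time-bar detection or boundary refinement.

```python
def event_to_frame_window(event_game_time_s, ref_frame, ref_game_time_s, fps,
                          pre_s=20.0, post_s=10.0):
    """Map an event's game-clock time (from Web-casting text) to a video frame
    and a coarse highlight window around it."""
    # Frames elapsed since the reference frame whose clock reading we trust.
    offset_frames = (event_game_time_s - ref_game_time_s) * fps
    event_frame = int(round(ref_frame + offset_frames))
    start = max(0, int(round(event_frame - pre_s * fps)))
    end = int(round(event_frame + post_s * fps))
    return event_frame, (start, end)

# A goal reported at 23'10" of the match; the clock read 20'00" at frame 30000 (25 fps).
print(event_to_frame_window(23 * 60 + 10, ref_frame=30000,
                            ref_game_time_s=20 * 60, fps=25))
```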
{"title":"Highlight detection in soccer video using web-casting text","authors":"Xiaole Ding, Yin-Jun Miao, Fan Bu, Lifeng Sun, Shiqiang Yang","doi":"10.1109/MMSP.2008.4665148","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665148","url":null,"abstract":"Highlight detection is a challenge task in soccer video analysis. Using Web-casting text as external knowledge is proved to be a short cut to achieve both efficiency and effectiveness. Based on the previous framework using Web-casting text, we have improved the processes of video time detection and highlight boundary detection. Our method can detect the transparent time bar and can achieve acceptable precision in highlight boundary detection though the Web text time is not accurate at all. This progress can make the framework more robust in practice.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116288582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665170
Xudong Lv, Z. J. Wang
Dimension-reduction-based techniques, such as singular value decomposition (SVD) and non-negative matrix factorization (NMF), have been shown to provide excellent performance for robust and secure image hashing by retaining the essential features of the original image matrix while resisting intentional attacks. In this paper, we introduce a recently proposed low-distortion dimension reduction technique, referred to as the fast Johnson-Lindenstrauss transform (FJLT), and propose its use for image hashing. FJLT shares the low-distortion characteristics of a random projection but has much lower complexity. These two desirable properties make it suitable for image hashing. Our experimental results show that the proposed FJLT-based hash yields good robustness under a wide range of attacks. Furthermore, the influence of the secret key on the proposed hashing algorithm is evaluated using receiver operating characteristic (ROC) graphs, revealing the efficiency of the proposed approach.
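The sketch below gives a simplified, key-seeded projection in the spirit of FJLT: random signs, a fast Walsh-Hadamard transform to spread energy, then sparse coordinate sampling and a sign quantizer. It is not the authors' hashing pipeline (a true FJLT uses a sparse Gaussian projection rather than plain sampling, and the paper's feature extraction and quantization are omitted); it only illustrates how a key-dependent low-distortion projection can yield a short binary hash.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (input length must be a power of two)."""
    x = x.astype(float).copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)

def fjlt_like_hash(features, key, out_bits=64):
    """Key-dependent binary hash: random signs -> Hadamard -> sparse sampling -> sign."""
    rng = np.random.default_rng(key)
    n = len(features)
    assert n & (n - 1) == 0, "feature length must be a power of two"
    signs = rng.choice([-1.0, 1.0], size=n)            # random diagonal D
    spread = fwht(signs * features)                    # H D x spreads energy across coordinates
    idx = rng.choice(n, size=out_bits, replace=False)  # sparse sampling stands in for the projection P
    return (spread[idx] > 0).astype(np.uint8)

feat = np.arange(256, dtype=float)                     # stand-in for image feature values
print(fjlt_like_hash(feat, key=1234)[:16])
```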
{"title":"Fast Johnson-Lindenstrauss Transform for robust and secure image hashing","authors":"Xudong Lv, Z. J. Wang","doi":"10.1109/MMSP.2008.4665170","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665170","url":null,"abstract":"Dimension reduction based techniques, such as singular value decomposition (SVD) and non-negative matrix factorization (NMF), have been proved to provide excellent performance for robust and secure image hashing by retaining the essential features of the original image matrix while preventing intentional attacks. In this paper, we introduce a recently proposed low-distortion, dimension reduction technique, referred as fast Johnson-Lindenstrauss transform (FJLT), and propose the use of FJLT for image hashing. FJLT shares the low-distortion characteristics of a random projection but requires a much lower complexity. These two desirable properties make it suitable for image hashing. Our experiment results show that the proposed FJLT-based hash yields good robustness under a wide range of attacks. Furthermore, the influence of secret key on the proposed hashing algorithm is evaluated by receiver operating characteristics (ROC) graph, revealing the efficiency of the proposed approach.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114988089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665119
Ruiqin Xiong, D. Taubman
This paper investigates optimal PET protection for streaming scalably compressed streams over networks where the delivery time constraints allow limited retransmissions (LR) and the communication channels exhibit both random losses and random delays. A key property that must be considered in this scenario is the possibility that a packet successfully arrives at the receiver in time even if its acknowledgment has not been received by the sender at certain deadlines. This paper proposes an extended LR-PET scheme, namely random-delay LR-PET, in which additional streams may be sent to provide supplemental protection for packets whose acknowledgments are still missing at a specified time after transmission. To determine the optimal protection at each transmission opportunity, hypotheses concerning the number of acknowledged packets and the effect of future retransmissions are considered. As the key contribution of this paper, we develop a method to derive the effective overall recovery-probability-versus-redundancy characteristic, which significantly simplifies the actual protection assignment procedure. This paper also demonstrates the benefits of the optimization strategy proposed for this random-delay LR-PET scheme and the importance of the timing chosen for scheduling retransmissions.
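The "recovery probability versus redundancy" notion can be illustrated with the elementary calculation below: an element protected by an (N, k) erasure code is recovered if at least k of its N packets arrive. This sketch covers only the basic i.i.d.-loss case and does not reproduce the paper's extension to random delays, acknowledgments, or retransmission hypotheses.

```python
from math import comb

def recovery_probability(n_packets, k_needed, loss_prob):
    """P(at least k_needed of n_packets arrive) under i.i.d. packet loss."""
    p_arrive = 1.0 - loss_prob
    return sum(comb(n_packets, i) * p_arrive**i * loss_prob**(n_packets - i)
               for i in range(k_needed, n_packets + 1))

# More redundancy (smaller k for a fixed N) buys a higher recovery probability.
for k in (10, 8, 6):
    print(k, round(recovery_probability(10, k, loss_prob=0.1), 4))
```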
{"title":"Optimal LR-PET protection for scalable video streams over lossy channels with random delay","authors":"Ruiqin Xiong, D. Taubman","doi":"10.1109/MMSP.2008.4665119","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665119","url":null,"abstract":"This paper investigates the optimal PET protection for streaming scalably compressed streams over networks where the delivery time constraints allow limited retransmissions (LR) and the communication channels exhibit both random losses and delays. A key property must be considered in this scenario is the possibility that a packet successfully arrives at the receiver in time, even if its acknowledgment is not received by the sender at certain deadlines. This paper proposes an extended LRPET scheme, namely random-delay LR-PET, in which additional streams may be sent to provide supplemental protection for the packets whose acknowledgments are still missing at a specified time after the transmission. To determine the optimal protection in each transmission opportunity, hypotheses concerning the number of acknowledged packets and the effect of future retransmission are considered. As the key contribution of this paper, we develop a method to derive the effective overall recovery probability versus redundancy characteristic, which significantly simplifies the actual protection assignment procedure. This paper also demonstrates the benefits of the optimization strategy proposed for this random-delay LR-PET scheme and the cruciality of time selection for scheduling retransmission.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128396404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665135
Hui Li, Yuhua Peng, W. Hwang
Improving the subjective quality and reducing the computational complexity of interpolation algorithms are important issues in video and network signal processing. To this end, we propose a fast adaptive image interpolation algorithm that classifies pixels and uses different linear interpolation kernels adapted to the class of each pixel. Pixels are classified into regions relevant to the perception of an image: texture regions, edge regions, or smooth regions. Image interpolation is performed with Neville filters, which can be efficiently implemented by a lifting scheme. Since linear interpolation tends to over-smooth pixels in edge and texture regions, we apply the Laplacian operator to enhance the pixels in those regions. Simulation results show that the proposed algorithm not only reduces the computational complexity of the process but also improves the visual quality of the interpolated images.
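The classify-then-enhance idea can be sketched as below, assuming a crude variance-based classifier with made-up thresholds and a plain separable linear upscaling in place of the Neville-filter lifting implementation; only the Laplacian enhancement step mirrors the abstract directly.

```python
import numpy as np

def classify_block(block, t_smooth=25.0, t_edge=400.0):
    """Crude stand-in for the pixel classifier: label a neighbourhood by its variance."""
    v = float(np.var(block))
    if v < t_smooth:
        return "smooth"
    return "edge" if v < t_edge else "texture"

def laplacian_sharpen(img, amount=0.5):
    """4-neighbour Laplacian enhancement for edge/texture regions."""
    lap = (-4.0 * img
           + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return img - amount * lap

def upscale2x(img):
    """Crude separable linear 2x upscaling (wrap-around borders kept for brevity)."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1).astype(float)
    up = 0.5 * (up + np.roll(up, -1, axis=1))   # smooth horizontally
    up = 0.5 * (up + np.roll(up, -1, axis=0))   # smooth vertically
    return up

img = np.add.outer(np.arange(0.0, 32.0, 4.0), np.arange(0.0, 32.0, 4.0))  # toy 8x8 ramp
up = upscale2x(img)
if classify_block(img) != "smooth":     # enhance only non-smooth regions
    up = laplacian_sharpen(up)
print(up.shape)                          # -> (16, 16)
```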
{"title":"A fast content-dependent interpolation approach via adaptive filtering","authors":"Hui Li, Yuhua Peng, W. Hwang","doi":"10.1109/MMSP.2008.4665135","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665135","url":null,"abstract":"Improving the subjective quality and reducing the computational complexity of interpolation algorithms are important issues in video and network signal processing. To this end, we propose a fast adaptive image interpolation algorithm that classifies pixels and uses different linear interpolation kernels that are adaptive to the class of a pixel. Pixels are classified into regions relevant to the perception of an image, either in a texture region, an edge region, or a smooth region. Image interpolation is performed with Neville filters, which can be efficiently implemented by a lifting scheme. Since linear interpolation tends to over-smooth pixels in edge regions and texture regions, we apply the Laplacian operator to enhance the pixels in those regions. The results of simulations show that the proposed algorithm not only reduces the computational complexity of the process, but also improves the visual quality of the interpolated images.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"262 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133726483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2008-11-05 | DOI: 10.1109/MMSP.2008.4665043
Jialie Shen, D. Tao, Xuelong Li
This paper describes a new video event detection framework based on a subspace selection technique. With this approach, feature vectors representing different kinds of video information can be easily projected from different modalities onto a unified subspace, on which the recognition process can be performed. The approach is capable of discriminating between different classes while preserving the intra-modal geometry of samples within the same class. Unlike existing multi-modal detection methods, the new system works well even when some modalities are unavailable. Experimental results based on soccer video and TRECVID news video collections demonstrate the effectiveness, efficiency, and robustness of the proposed method for individual recognition tasks in comparison with existing approaches.
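As a point of reference for the "project onto a unified subspace, then recognize" idea, the sketch below computes a generic Fisher-discriminant subspace from concatenated (toy) multi-modal features. It is not the authors' subspace selection method, which additionally preserves intra-modal geometry and copes with missing modalities.

```python
import numpy as np

def fisher_subspace(X, y, dim):
    """Generic supervised subspace: maximize between-class vs. within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)                 # within-class scatter
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)               # between-class scatter
    # Leading eigenvectors of pinv(Sw) @ Sb span the discriminative subspace.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)[:dim]
    return evecs[:, order].real

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 6)),
               rng.normal(3, 1, (50, 6))])            # toy concatenated multi-modal features
y = np.array([0] * 50 + [1] * 50)
W = fisher_subspace(X, y, dim=1)
print((X @ W).shape)                                  # samples projected onto the unified subspace
```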
{"title":"Effective video event detection via subspace projection","authors":"Jialie Shen, D. Tao, Xuelong Li","doi":"10.1109/MMSP.2008.4665043","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665043","url":null,"abstract":"This paper describes a new video event detection framework based on subspace selection technique. With the approach, feature vectors presenting different kinds of video information can be easily projected from different modalities onto an unified subspace, on which recognition process can be performed. The approach is capable of discriminating different classes and preserving the intra-modal geometry of samples within an identical class. Distinguished from the existing multi-modal detection methods, the new system works well when some modalities are not available. Experimental results based on soccer video and TRECVID news video collections demonstrate the effectiveness, efficiency and robustness of the proposed method for individual recognition tasks in comparison to the existing approaches.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134079787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}