Applications of Image Recognition for Real-Time Water Level and Surface Velocity
Franco Lin, Wen-Yi Chang, Lung-Cheng Lee, Hung-Ta Hsiao, W. Tsai, J. Lai
In this paper, we present two types of real-time water monitoring systems based on image processing: water level recognition and surface velocity recognition. Bridge failure investigations show that river floods often pose a potential risk to bridges, as scouring can undermine pier foundations and cause structures to collapse; developing field monitoring techniques for bridge safety is therefore very important. In this study, we installed two high-resolution cameras at an in-situ bridge site to capture real-time water level and surface velocity images. For water level recognition, we use image binarization, character recognition, and water line detection. For surface velocity recognition, the proposed system applies the Particle Image Velocimetry (PIV) method, estimating the water surface velocity through cross-correlation analysis. Finally, the proposed systems were used to record and measure variations of water level and surface velocity over a period of three days. The results show that the proposed systems have the potential to provide real-time information on water level and surface velocity during flood periods.
{"title":"Applications of Image Recognition for Real-Time Water Level and Surface Velocity","authors":"Franco Lin, Wen-Yi Chang, Lung-Cheng Lee, Hung-Ta Hsiao, W. Tsai, J. Lai","doi":"10.1109/ISM.2013.49","DOIUrl":"https://doi.org/10.1109/ISM.2013.49","url":null,"abstract":"In this paper, we present two types of the real-time water monitoring system using the image processing technology, the water level recognition and the surface velocity recognition. According to the bridge failure investigation, floods in the river often pose potential risk to bridges, and scouring could undermine the pier foundation and cause the structures to collapse. It is very important to develop monitoring techniques for bridge safety in the field. In this study, we installed two high-resolution cameras on the in-situ bridge site to get the real-time water level and surface velocity image. For the water level recognition, we use the image processing techniques of the image binarization, character recognition, and water line detection. For the surface velocity recognition, the proposed system apply the PIV(Particle Image Velocimetry, PIV) method to obtain the recognition of the water surface velocity by the cross correlation analysis. Finally, the proposed systems are used to record and measure the variations of the water level and surface velocity for a period of three days. The good results show that the proposed systems have potential to provide real-time information of water level and surface velocity during flood periods.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"9 1 1","pages":"259-262"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78266870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Searching for Near-Duplicate Video Sequences from a Scalable Sequence Aligner
Leonardo S. de Oliveira, Zenilton K. G. Patrocínio, S. Guimarães, G. Gravier
Near-duplicate video sequence identification consists of locating the actual positions of a specific video clip within a video stream stored in a database. To address this problem, we propose a new approach based on a scalable sequence aligner borrowed from proteomics. Sequence alignment is performed on symbolic representations of features extracted from the input videos, using an algorithm originally developed for bioinformatics. Experimental results demonstrate that our method achieved 94% recall at 100% precision, with an average search time of about 1 second.
{"title":"Searching for Near-Duplicate Video Sequences from a Scalable Sequence Aligner","authors":"Leonardo S. de Oliveira, Zenilton K. G. Patrocínio, S. Guimarães, G. Gravier","doi":"10.1109/ISM.2013.42","DOIUrl":"https://doi.org/10.1109/ISM.2013.42","url":null,"abstract":"Near-duplicate video sequence identification consists in identifying real positions of a specific video clip in a video stream stored in a database. To address this problem, we propose a new approach based on a scalable sequence aligner borrowed from proteomics. Sequence alignment is performed on symbolic representations of features extracted from the input videos, based on an algorithm originally applied to bio-informatics. Experimental results demonstrate that our method performance achieved 94% recall with 100% precision, with an average searching time of about 1 second.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"21 1","pages":"223-226"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80337534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards More Robust Commutative Watermarking-Encryption of Images
R. Schmitz, Shujun Li, C. Grecos, Xinpeng Zhang
Histogram-based watermarking schemes are invariant under pixel permutations and can therefore be combined with permutation-based ciphers. However, typical histogram-based watermarking schemes that compare histogram bins are prone to de-synchronization attacks, in which the whole histogram is shifted by a certain amount. In this paper we investigate the possibility of defeating this kind of attack by synchronizing the embedding and detection processes, using the mean of the histogram as a calibration point. The resulting watermarking scheme is resistant to three common types of histogram shift, while the advantages of previous histogram-based schemes, in particular the commutativity of watermarking and permutation-based encryption, are preserved.
{"title":"Towards More Robust Commutative Watermarking-Encryption of Images","authors":"R. Schmitz, Shujun Li, C. Grecos, Xinpeng Zhang","doi":"10.1109/ISM.2013.54","DOIUrl":"https://doi.org/10.1109/ISM.2013.54","url":null,"abstract":"Histogram-based watermarking schemes are invariant against pixel permutations and can be combined with permutation-based ciphers. However, typical histogram-based watermarking schemes based on comparison of histogram bins are prone to de-synchronization attacks, where the whole histogram is shifted by a certain amount. In this paper we investigate the possibility of avoiding this kind of attacks by synchronizing the embedding and detection processes, using the mean of the histogram as a calibration point. The resulting watermarking scheme is resistant to three common types of shifts of the histogram, while the advantages of previous histogram-based schemes, especially commutativity of watermarking and permutation-based encryption, are preserved.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"40 1","pages":"283-286"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77697828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Impact of Video Transcoding Parameters on Event Detection for Surveillance Systems
E. Kafetzakis, Chris Xilouris, M. Kourtis, M. Nieto, Iveel Jargalsaikhan, S. Little
Transcoding videos is not only computationally intensive but can also be a rather complex procedure. The complexity lies in choosing appropriate parameters for the transcoding engine, with the aim of decreasing video size, transcoding time, and network bandwidth without degrading video quality beyond the threshold at which event detectors lose accuracy. This paper explains the need for transcoding and then studies different video quality metrics. Commonly used algorithms for motion and person detection are briefly described, with emphasis on investigating the optimal transcoding configuration parameters. Analysis of the experimental results reveals that existing video quality metrics are not suitable for automated systems, and that person detection is affected by reductions in bit rate and resolution, while motion detection is more sensitive to frame rate.
{"title":"The Impact of Video Transcoding Parameters on Event Detection for Surveillance Systems","authors":"E. Kafetzakis, Chris Xilouris, M. Kourtis, M. Nieto, Iveel Jargalsaikhan, S. Little","doi":"10.1109/ISM.2013.64","DOIUrl":"https://doi.org/10.1109/ISM.2013.64","url":null,"abstract":"The process of transcoding videos apart from being computationally intensive, can also be a rather complex procedure. The complexity refers to the choice of appropriate parameters for the transcoding engine, with the aim of decreasing video sizes, transcoding times and network bandwidth without degrading video quality beyond some threshold that event detectors lose their accuracy. This paper explains the need for transcoding, and then studies different video quality metrics. Commonly used algorithms for motion and person detection are briefly described, with emphasis in investigating the optimum transcoding configuration parameters. The analysis of the experimental results reveals that the existing video quality metrics are not suitable for automated systems and that the detection of persons is affected by the reduction of bit rate and resolution, while motion detection is more sensitive to frame rate.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"15 1","pages":"333-338"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88966656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal Sparse Linear Integration for Content-Based Item Recommendation
Qiusha Zhu, Zhao Li, Haohong Wang, Yimin Yang, M. Shyu
Most content-based recommender systems focus on analyzing the textual information of items. For items with images, the images can be treated as another information modality. In this paper, an effective method called MSLIM is proposed to integrate multimodal information for content-based item recommendation. It formalizes the problem as a regularized optimization problem in the least-squares sense, which is solved by coordinate gradient descent. The aggregation coefficients of the items are learned in an unsupervised manner during this process; based on these coefficients, the k-nearest neighbor (k-NN) algorithm is used to generate the top-N recommendations for each item by finding its k nearest neighbors. A framework for using MSLIM for item recommendation is proposed accordingly. Experimental results on a self-collected handbag dataset show that MSLIM outperforms the selected comparison methods and illustrate how the model parameters affect the final recommendation results.
{"title":"Multimodal Sparse Linear Integration for Content-Based Item Recommendation","authors":"Qiusha Zhu, Zhao Li, Haohong Wang, Yimin Yang, M. Shyu","doi":"10.1109/ISM.2013.37","DOIUrl":"https://doi.org/10.1109/ISM.2013.37","url":null,"abstract":"Most content-based recommender systems focus on analyzing the textual information of items. For items with images, the images can be treated as another information modality. In this paper, an effective method called MSLIM is proposed to integrate multimodal information for content-based item recommendation. It formalizes the probelm into a regularized optimization problem in the least-squares sense and the coordinate gradient descent is applied to solve the problem. The aggregation coefficients of the items are learned in an unsupervised manner during this process, based on which the k-nearest neighbor (k-NN) algorithm is used to generate the top-N recommendations of each item by finding its k nearest neighbors. A framework of using MSLIM for item recommendation is proposed accordingly. The experimental results on a self-collected handbag dataset show that MSLIM outperforms the selected comparison methods and show how the model parameters affect the final recommendation results.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"185 1","pages":"187-194"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78050324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient Implementation and Processing of a Real-Time Panorama Video Pipeline
Marius Tennøe, Espen Helgedagsrud, Mikkel Næss, Henrik Kjus Alstad, H. Stensland, V. Reddy, Dag Johansen, C. Griwodz, P. Halvorsen
High-resolution, wide-field-of-view video generated from multiple camera feeds has many use cases. However, processing the different steps of a panorama video pipeline in real time is challenging due to the high data rates and stringent timeliness requirements. We use panorama video in a sports analysis system in which video events must be generated in real time. In this context, we present a system for real-time panorama video generation from an array of low-cost CCD HD video cameras. We describe how we have implemented the different components and evaluated alternatives. We also present performance results with and without co-processors such as graphics processing units (GPUs), evaluating each individual component and showing how the entire pipeline is able to run in real time on commodity hardware.
{"title":"Efficient Implementation and Processing of a Real-Time Panorama Video Pipeline","authors":"Marius Tennøe, Espen Helgedagsrud, Mikkel Næss, Henrik Kjus Alstad, H. Stensland, V. Reddy, Dag Johansen, C. Griwodz, P. Halvorsen","doi":"10.1109/ISM.2013.21","DOIUrl":"https://doi.org/10.1109/ISM.2013.21","url":null,"abstract":"High resolution, wide field of view video generated from multiple camera feeds has many use cases. However, processing the different steps of a panorama video pipeline in real-time is challenging due to the high data rates and the stringent requirements of timeliness. We use panorama video in a sport analysis system where video events must be generated in real-time. In this respect, we present a system for real-time panorama video generation from an array of low-cost CCD HD video cameras. We describe how we have implemented different components and evaluated alternatives. We also present performance results with and without co-processors like graphics processing units (GPUs), and we evaluate each individual component and show how the entire pipeline is able to run in real-time on commodity hardware.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"131 1","pages":"76-83"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86834280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Automatic Object Retrieval Framework for Complex Background
Yimin Yang, Fausto Fleites, Haohong Wang, Shu-Ching Chen
In this paper we propose a novel framework for object retrieval based on automatic foreground object extraction and multi-layer information integration. Specifically, objects of interest to the user are first detected in unconstrained videos via a multimodal-cue method; an automatic object extraction algorithm based on GrabCut is then applied to separate the foreground object from the background. The object-level information is enhanced in the feature extraction layer by assigning different weights to foreground and background pixels, and spatial color and texture information is integrated in the similarity calculation layer. Experimental results on both a benchmark dataset and a real-world dataset demonstrate the effectiveness of the proposed framework.
{"title":"An Automatic Object Retrieval Framework for Complex Background","authors":"Yimin Yang, Fausto Fleites, Haohong Wang, Shu‐Ching Chen","doi":"10.1109/ISM.2013.71","DOIUrl":"https://doi.org/10.1109/ISM.2013.71","url":null,"abstract":"In this paper we propose a novel framework for object retrieval based on automatic foreground object extraction and multi-layer information integration. Specifically, user interested objects are firstly detected from unconstrained videos via a multimodal cues method, then an automatic object extraction algorithm based on Grab Cut is applied to separate foreground object from background. The object-level information is enhanced during the feature extraction layer by assigning different weights to foreground and background pixels respectively, and the spatial color and texture information is integrated during the similarity calculation layer. Experimental results on both benchmark data set and real-world data set demonstrate the effectiveness of the proposed framework.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"1 1","pages":"374-377"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84710378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Image Super-resolution Using Registration of Wavelet Multi-scale Components with Affine Transformation
Y. Matsuo, Ryoki Takada, Shinya Iwasaki, J. Katto
We propose a novel image super-resolution method for converting digital cinema to 8K ultra-high-definition television, using registration of wavelet multi-scale components with affine transformation. In the proposed method, the original image is first divided into signal and noise components by wavelet soft-shrinkage with white-noise-level detection. The resolution of the signal component is enhanced by registration between the signal component and its wavelet multi-scale components, using affine transformation with parameter optimization; the affine transformation improves super-resolution image quality because it increases the number of registration candidates. The resolution of the noise component is enhanced with power control that accounts for the noise characteristics of cinema. The super-resolution image is output by synthesizing the super-resolved signal and noise components. Experiments show that the proposed method achieves objectively better PSNR and subjectively better appearance than conventional super-resolution methods.
{"title":"Image Super-resolution Using Registration of Wavelet Multi-scale Components with Affine Transformation","authors":"Y. Matsuo, Ryoki Takada, Shinya Iwasaki, J. Katto","doi":"10.1109/ISM.2013.53","DOIUrl":"https://doi.org/10.1109/ISM.2013.53","url":null,"abstract":"We propose a novel image super-resolution method from digital cinema to 8K ultra high-definition television using registration of wavelet multi-scale components with affine transformation. The proposed method features that an original image is divided into signal and noise components by the wavelet soft-shrinkage with detection of white noise level. The signal component enhances resolution by registration between a signal component and its wavelet multi-scale components with affine transformation and parameters optimization. The affine transformation enhances super-resolution image quality because it increases registration candidates. The noise component enhances resolution with power control considering cinema noise representation. Super-resolution image outputs by synthesis of super-resolved signal and noise components. Experiments show that the proposed method has objectively better PSNR measurement and subjectively better appearance in comparison with conventional super-resolution methods.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"36 3 1","pages":"279-282"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89589593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Foreground detection using background subtraction with histogram
M. Nawaz, J. Cosmas, A. Adnan, M. F. U. Haq, E. Alazawi
A core issue in the background subtraction method is how to set the threshold value precisely at run time; doing so can ultimately overcome several shortcomings of this approach to foreground detection. The proposed algorithm relies on motion, the key feature of any foreground detection algorithm. Since the threshold value cannot be obtained directly from the original motion histogram, a smoothed motion histogram is used in a systematic way to derive it. The main focus of the proposed algorithm is a better estimate of the threshold, obtained dynamically from the histogram at run time. Used intelligently in terms of motion magnitude and motion direction, the algorithm can accurately distinguish between background and foreground, and between camera motion and object motion.
{"title":"Foreground detection using background subtraction with histogram","authors":"M. Nawaz, J. Cosmas, A. Adnan, M. F. U. Haq, E. Alazawi","doi":"10.1109/BMSB.2013.6621707","DOIUrl":"https://doi.org/10.1109/BMSB.2013.6621707","url":null,"abstract":"In the background subtraction method one of the core issue is; how to setup the threshold value precisely at run time, which can ultimately overcome several bugs of this approach in the foreground detection. In the proposed algorithm the key feature of any foreground detection algorithm; motion is used however getting the threshold value from the original motion histogram is not possible, so for the said purpose smooth motion histogram is used in a systematic way to obtain the threshold value. In the proposed algorithm the main focus is to get a better estimation of threshold so that to get a dynamic value, from histogram at run time. If the proposed algorithm is used intelligently in terms of motion magnitude and motion direction it can distinguish accurately between background and foreground, camera motion along with camera motion and object motion.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"43 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2013-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73407698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MMT, new alternative to MPEG-2 TS and RTP
Youngkwon Lim
During the last two decades, MPEG has successfully developed standards for multimedia delivery such as MPEG-2 TS and the ISO Base Media File Format. Recent changes in the multimedia delivery environment, driven by the rapid growth of multimedia services over the Internet, have brought new requirements for multimedia delivery standards, namely (i) flexible and dynamic access to multimedia components, (ii) easy conversion between the storage format and the packetized delivery format, and (iii) mixed use of multimedia components from multiple sources, including caches and local storage. MPEG has started developing the MPEG Media Transport (MMT) standard to respond to these new requirements. In this paper, the challenges these requirements pose to MPEG-2 TS and RTP are discussed, and a brief description of MMT shows how it addresses them.
{"title":"MMT, new alternative to MPEG-2 TS and RTP","authors":"Youngkwon Lim","doi":"10.1109/BMSB.2013.6621691","DOIUrl":"https://doi.org/10.1109/BMSB.2013.6621691","url":null,"abstract":"During the last two decades MPEG has successfully developed the standards for multimedia delivery such as MPEG-2 TS and ISO Base Media File Format. Recent changes of multimedia delivery environment due to the rapid increase of multimedia services over the Internet brought new requirements to the standards for multimedia delivery, namely (i) flexible and dynamic access to multimedia components, (ii) Easy conversion between the format for storage and the format for the packetized delivery and (iii) mixed use of multimedia components from multiple sources including the caches and the local storages. MPEG has started the development of MPEG Media Transport (MMT) standard to respond to such new requirements. In this paper, challenges to MPEG-2 TS and RTP regarding new requirements are discussed and brief descriptions about MMT showing how MMT provides solutions to those challenges are provided.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"24 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2013-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74806526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}