Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120607
Hirotaka Tanaka, Yuji Waizumi, T. Kasezawa
To improve text detection and recognition in natural scenes, it is important to improve legibility by removing the effects of illumination from the image. Therefore, in this study, we present a signal enhancement method for the dark regions of an image. In addition, we propose a procedure that reduces the computational complexity of the bilateral filter used to estimate illumination in our method. Since our main aim is to provide preprocessing for text detection and recognition in natural scenes, we propose a simple process in which only the signals of the dark regions of the image are enhanced. Experimental results show that our method is effective in preserving the naturalness of the image while improving the legibility of text in natural scenes. Furthermore, our proposed procedure reduces the computational load of the bilateral filter by about 30%.
Title: "Retinex-based signal enhancement for image dark regions". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
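The abstract leaves the filter and enhancement details unspecified; as a rough illustration only, an illumination-based dark-region boost might look like the following sketch, where the brute-force bilateral filter, the 0.3 darkness threshold, and the fixed gain are all assumptions rather than the authors' actual design:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Brute-force bilateral filter on a float image in [0, 1]."""
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))  # spatial kernel
    for y in range(h):
        for x in range(w):
            patch = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # range kernel: down-weights neighbours with dissimilar intensity
            rng = np.exp(-((patch - img[y, x]) ** 2) / (2 * sigma_r**2))
            wgt = spatial * rng
            out[y, x] = np.sum(wgt * patch) / np.sum(wgt)
    return out

def enhance_dark_regions(img, dark_thresh=0.3, gain=2.0):
    """Boost only pixels whose estimated illumination falls below dark_thresh."""
    illum = bilateral_filter(img)          # edge-preserving illumination estimate
    mask = illum < dark_thresh
    out = img.copy()
    out[mask] = np.clip(img[mask] * gain, 0.0, 1.0)
    return out
```

Because the bilateral filter preserves the dark/bright boundary, only genuinely dark regions are amplified, which is what keeps the result looking natural.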
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120610
R. A. Hamzah, M. S. Hamid, Ahmad Fauzan Kadmin, S. Ghani
This paper proposes a new local-based stereo correspondence algorithm. The Sum of Absolute Differences (SAD) algorithm produces accurate disparity maps in textured regions. However, it is sensitive to low-texture areas and to high noise in images with large differences in brightness and contrast. To overcome these problems, the proposed algorithm employs an edge-preserving filter known as the Bilateral Filter (BF). The BF kernel recovers low-texture areas well, reducing noise and sharpening the images. Additionally, the BF is robust against distortions caused by high brightness and contrast. The proposed work produces accurate results and performs much better than several established algorithms, based on quantitative and qualitative measurements using the standard Middlebury stereo benchmark evaluation.
Title: "Improvement of stereo corresponding algorithm based on sum of absolute differences and edge preserving filter". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
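For readers unfamiliar with SAD block matching, the baseline the paper builds on can be sketched as below; the window size, disparity range, and winner-takes-all selection are generic textbook choices, not details taken from this paper:

```python
import numpy as np

def sad_disparity(left, right, max_disp=8, win=3):
    """Winner-takes-all SAD block matching on rectified grayscale images.
    Returns an integer disparity map referenced to the left image."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    padL = np.pad(left.astype(float), half, mode="edge")
    padR = np.pad(right.astype(float), half, mode="edge")
    for y in range(h):
        for x in range(w):
            best, best_d = np.inf, 0
            blkL = padL[y:y + win, x:x + win]
            for d in range(min(max_disp, x) + 1):
                blkR = padR[y:y + win, x - d:x - d + win]
                cost = np.abs(blkL - blkR).sum()   # sum of absolute differences
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

The paper's contribution replaces the plain box window implicit here with bilateral-filter weighting, which is what stabilises the cost in low-texture and noisy regions.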
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120626
Tareq Aziz AL-Qutami, R. Ibrahim, I. Ismail
Virtual flow metering (VFM) is an attractive and cost-effective solution to meet the rising multiphase flow monitoring demands of the petroleum industry. It can also augment and back up physical multiphase flow metering. In this study, a heterogeneous ensemble of neural networks and regression trees is proposed to develop a VFM model, using bootstrapping and parameter perturbation to generate diversity among learners. The ensemble is pruned using simulated annealing optimization to further ensure accuracy and reduce ensemble complexity. The proposed VFM model is validated using five years of well-test data from eight production wells. Results show improved performance over homogeneous ensemble techniques. Average errors achieved are 1.5%, 6.5%, and 4.7% for gas, oil, and water flow rate estimations, respectively. The developed VFM provides accurate flow rate estimates across a wide range of gas volume fractions and water cuts, and is anticipated to be a step towards the vision of completely integrated operations.
Title: "Hybrid neural network and regression tree ensemble pruned by simulated annealing for virtual flow metering application". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
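The pruning step described above can be illustrated with a generic sketch: simulated annealing over a binary member-inclusion mask that minimises the validation MSE of the averaged prediction. All hyperparameters (iteration count, temperature schedule) are placeholders, not the authors' settings:

```python
import numpy as np

def sa_prune(preds, y, iters=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated-annealing search for a subset of ensemble members whose
    averaged prediction minimises validation MSE.
    preds: (n_members, n_samples) member predictions; y: (n_samples,) targets."""
    rng = np.random.default_rng(seed)
    n = preds.shape[0]
    mask = np.ones(n, dtype=bool)            # start from the full ensemble

    def mse(m):
        if not m.any():
            return np.inf                    # forbid the empty ensemble
        return float(np.mean((preds[m].mean(axis=0) - y) ** 2))

    cur = best = mse(mask)
    best_mask, t = mask.copy(), t0
    for _ in range(iters):
        cand = mask.copy()
        cand[rng.integers(n)] ^= True        # flip one member in or out
        c = mse(cand)
        # accept improvements always; accept worse moves with annealed probability
        if c < cur or rng.random() < np.exp((cur - c) / t):
            mask, cur = cand, c
            if cur < best:
                best, best_mask = cur, mask.copy()
        t *= cooling
    return best_mask, best
```

The temperature lets early iterations escape local minima; as it cools, the search settles on the lowest-error subset found.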
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120650
Ahmed Kamil Hasan Al-Ali, B. Senadji, V. Chandran
The robustness of speaker verification systems is often degraded in real forensic applications, which involve environmental noise and reverberation. Reverberation results in mismatched conditions between enrolment and test speech signals. In this work, we investigate the effectiveness of combining discrete wavelet transform (DWT) features with feature-warped mel frequency cepstral coefficients (MFCCs) to improve the performance of speaker verification under reverberation and environmental noise. State-of-the-art intermediate vector (i-vector) modelling and probabilistic linear discriminant analysis (PLDA) were used as the classifier. The algorithm was evaluated by convolving the room impulse response with enrolment speech from an Australian forensic voice comparison database. The test speech signals were combined with car, street, and home noises from the QUT-NOISE database at signal-to-noise ratios (SNRs) ranging from −10 dB to 10 dB. Experimental results indicate that the algorithm reduces the average equal error rate (EER) by 17.10% to 51.86% over traditional MFCC features when the reverberated enrolment data and the test speech signals are corrupted with car, street, and home noises at SNRs ranging from −10 dB to 10 dB.
Title: "Hybrid DWT and MFCC feature warping for noisy forensic speaker verification in room reverberation". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
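Feature warping, one ingredient of the proposed front end, is commonly implemented by mapping each coefficient's rank inside a sliding window to the corresponding standard-normal quantile. The sketch below follows that common recipe; the window length and rank convention are assumptions, not taken from this paper:

```python
import numpy as np
from statistics import NormalDist

def feature_warp(feats, win=301):
    """Short-time feature warping of one cepstral coefficient stream:
    each frame's value is replaced by the standard-normal quantile of its
    rank within a sliding window, making the local distribution Gaussian."""
    n = len(feats)
    half = win // 2
    nd = NormalDist()
    warped = np.empty(n)
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        window = feats[lo:hi]
        rank = np.sum(window < feats[t]) + 1          # 1-based rank in window
        warped[t] = nd.inv_cdf((rank - 0.5) / len(window))
    return warped
```

Because the mapping depends only on ranks, additive channel and noise effects that preserve ordering within the window are largely removed, which is why warping helps under mismatched conditions.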
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120593
M. Vagac, M. Melichercík, Michaela Samuelcikova
Detection of repetitive patterns in images is the subject of several research papers. The majority of them deal with the detection of lattice patterns of repetitive elements. However, there are many situations in which an element's repetition does not follow any particular pattern. In this paper we focus on two objectives. First, our algorithm detects repetitive elements regardless of their relative positions. Second, it extracts the shape of the repetitive element. The main contribution of this paper is an algorithm able to extract the shape of elements that do not necessarily repeat regularly.
Title: "Extraction of geometric shape of repetitive elements with application to traceology". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120584
Li-Der Fang, Wen-Hsien Fang, D. Chang, Yie-Tarng Chen
Microwave imaging (MWI) is a promising imaging modality for breast tumor detection. One challenge faced by ultra-wideband (UWB) radar-based breast cancer detection is the estimation of the clutter-plus-noise covariance matrix. To render a more accurate covariance matrix estimate when the number of samples is small, this paper presents a new covariance matrix estimate based on the shrinkage method. The parameters of the proposed shrinkage-based covariance matrix are cast as a modified semi-definite programming (SDP) problem based on the minimum mean-squared error (MMSE) criterion. Moreover, to reduce the computational overhead, we also incorporate the compressive sensing (CS) technique into the above scheme for UWB breast tumor detection. The performance of the Capon beamformer based on the reconstructed covariance matrix is tested in a multistatic scenario using a 2-D numerical breast analysis model.
Title: "Robust breast tumor detection via shrinkage covariance matrix estimation". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
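The shrinkage idea can be illustrated generically: blend the sample covariance with a scaled-identity target so that the estimate stays well-conditioned when samples are few. The blending-weight heuristic below is a placeholder; the paper instead derives its parameters from an MMSE-based SDP formulation:

```python
import numpy as np

def shrinkage_cov(X, alpha=None):
    """Shrink the sample covariance toward a scaled identity target:
    S_shrunk = (1 - alpha) * S + alpha * mu * I, with mu = trace(S) / p.
    X: (n_samples, p) data matrix."""
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)   # sample covariance, may be rank-deficient
    mu = np.trace(S) / p                     # average eigenvalue of S
    if alpha is None:
        # simple sample-starvation heuristic for the shrinkage intensity;
        # Ledoit-Wolf-style data-driven formulas are the usual refinement
        alpha = min(1.0, p / (n + p))
    return (1 - alpha) * S + alpha * mu * np.eye(p)
```

Shrinking toward mu·I keeps the trace (total power) unchanged while lifting the zero eigenvalues, so a downstream Capon beamformer can invert the matrix reliably.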
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120657
Mohamed Shaafiee, R. Logeswaran
The advent of 8K and higher video resolutions poses problems for the capture and storage of data at these standards. The contemporary alternative is to compromise on quality and use various (often lossy) compression techniques to reduce the bandwidth required to move this data. This paper proposes a novel method for handling large volumes of video data without compromising its quality through space-saving techniques such as chroma subsampling. A proposed implementation is also presented. The method is shown to be capable of handling the capture and storage of raw 8K video data and to support better video streaming.
Title: "Non-von-neumann heap for better streaming, capturing and storing of raw 8K video data". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120576
K. Lay, M. Zhou
Nowadays, QR (quick-response) codes have become part of our daily life. In many applications, QR codes are posted (i.e. pasted or printed) on cylinders. Then, the QR image as captured by a camera would be distorted. In this paper, we try to tackle the decoding of QR codes in such a situation. It is based on perspective projection (PP), which is specified by a camera matrix (cM), with the assistance of cross ratio (CR). In the proposed scheme, the mathematics involved is neat, and the computation is fast. Experimental results show that the proposed scheme is effective, in the sense that with the aid of it many failed decoding attempts became successful.
Title: "Perspective projection for decoding of QR codes posted on cylinders". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
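The paper's camera-matrix and cross-ratio machinery is not spelled out in the abstract; as a related, generic illustration of undoing perspective distortion, a planar homography can be estimated from four point correspondences with the direct linear transform. This is a standard technique, not the authors' cylinder-specific method:

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 homography mapping src -> dst (>= 4 point pairs)
    via the direct linear transform: stack two rows per correspondence
    and take the SVD null vector."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)       # right-singular vector of smallest singular value
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply a homography to a 2-D point (homogeneous divide included)."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])
```

In a QR setting, the four finder/alignment corners would supply the correspondences, and the inverse homography rectifies the distorted module grid before sampling.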
Salient Object Detection (SOD) has received much attention from the research community due to its increasing applications in areas such as object detection and recognition, image editing, image and video compression, and video summarization. Most SOD methods proposed in the literature presume that the digital images in which salient objects are to be detected are free from any kind of artifact. SOD in the presence of noise has received far less attention. In this paper, we study and analyze popular salient object detection methods in the presence of Gaussian, salt-and-pepper, and speckle noise. Extensive experiments are performed on two publicly available SOD datasets, MSRA5K and DUT-OMRON. The performance of the methods is evaluated in terms of precision, recall, and F-measure. We find that the Context-Aware saliency detection (CA) method gives the highest precision, while Graph-Based Visual Saliency (GB) gives the highest recall and F-measure on both datasets in the presence of any of the three noises.
Title: "A study of training free salient object detection methods in presence of noise". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
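The evaluation metrics used in the study can be stated concretely for binarised saliency maps. A common SOD convention weights precision more heavily in the F-measure via beta² = 0.3; whether this paper uses that exact weighting is an assumption:

```python
import numpy as np

def prf(pred, gt, beta2=0.3):
    """Precision, recall and weighted F-measure for binary saliency masks.
    pred, gt: boolean arrays of the same shape; beta2 = 0.3 emphasises precision,
    as is conventional in salient object detection benchmarks."""
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)       # fraction of predicted pixels that are salient
    recall = tp / max(gt.sum(), 1)            # fraction of salient pixels recovered
    f = ((1 + beta2) * precision * recall) / max(beta2 * precision + recall, 1e-12)
    return precision, recall, f
```

In practice the saliency map is thresholded (often adaptively, e.g. at twice its mean) before these counts are taken.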
Pub Date: 2017-09-01 · DOI: 10.1109/ICSIPA.2017.8120666
Arif Ahmed, D. P. Dogra, S. Kar, Renuka Patnaik, S. Lee, Heeseung Choi, Ig-Jae Kim
Millions of surveillance cameras operate 24×7, generating huge amounts of visual data for processing. However, retrieving important activities from such a large volume of data can be time-consuming. Thus, researchers are working on solutions that present hours of visual data in a compressed but meaningful way. Video synopsis is one way to represent activities using relatively short clips. So far, researchers have used two main approaches to address this problem: synopsis by tracking moving objects and synopsis by clustering moving objects. Synopsis outputs depend mainly on tracking, segmenting, and shifting moving objects temporally as well as spatially. In many situations tracking fails, producing multiple trajectories for the same object. As a result, the object may appear and disappear multiple times within the same synopsis output, which is misleading. This also leads to discontinuity and can confuse the viewer of the synopsis. In this paper, we present a new approach for generating a compressed video synopsis by grouping tracklets of moving objects. Grouping helps to generate a synopsis in which chronologically related objects appear together with a meaningful spatio-temporal relation. Our proposed method produces continuous and less confusing synopses when tested on publicly available as well as in-house dataset videos.
Title: "Video synopsis generation using spatio-temporal groups". Venue: 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA).
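The abstract does not specify how tracklets are grouped; one plausible, purely illustrative scheme links tracklets whose time spans and endpoints are close, using union-find. The gap and distance thresholds and the tuple layout are invented for this sketch and are not the authors' criteria:

```python
import numpy as np

def group_tracklets(tracklets, gap=5, dist=50.0):
    """Union-find grouping of broken tracklets: tracklet b joins tracklet a's
    group when b starts within `gap` frames of a's end and b's start point is
    within `dist` pixels of a's end point.
    Each tracklet is (t_start, t_end, (x_start, y_start), (x_end, y_end))."""
    n = len(tracklets)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def related(a, b):
        (s1, e1, _, p_end), (s2, e2, q_start, _) = a, b
        temporal = 0 <= s2 - e1 <= gap      # b starts shortly after a ends
        spatial = np.hypot(p_end[0] - q_start[0], p_end[1] - q_start[1]) <= dist
        return temporal and spatial

    for i in range(n):
        for j in range(n):
            if i != j and related(tracklets[i], tracklets[j]):
                parent[find(i)] = find(j)   # merge the two groups
    return [find(i) for i in range(n)]
```

Tracklets sharing a group label would then be shifted together in the synopsis, so an object whose track broke mid-scene does not appear and vanish twice.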