Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287056
Anouar Kherchouche, Sid Ahmed Fezza, W. Hamidouche, O. Déforges
Deep neural networks (DNNs) have been adopted in a wide spectrum of applications. However, it has been demonstrated that they are vulnerable to adversarial examples (AEs): carefully crafted perturbations added to a clean input image that fool DNNs into classifying it incorrectly. It is therefore imperative to develop methods for detecting AEs in order to defend DNNs. In this paper, we propose to characterize adversarial perturbations through the use of natural scene statistics. We demonstrate that these statistical properties are altered by the presence of adversarial perturbations. Based on this finding, we design a classifier that exploits these scene statistics to determine whether an input is adversarial or not. The proposed method has been evaluated against four prominent adversarial attacks on three standard datasets. The experimental results show that the proposed detection method achieves high detection accuracy, even against strong attacks, while maintaining a low false positive rate.
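As an illustration of the general approach, here is a minimal sketch of NSS-style feature extraction feeding a binary detector. The mean-subtracted contrast-normalized (MSCN) transform is one standard natural scene statistic; the feature set, constants, and RBF SVM below are assumptions for illustration, not the paper's exact pipeline.

```python
# Minimal sketch, assuming MSCN-based NSS features and an RBF SVM detector;
# the paper's exact features and classifier may differ.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVC

def mscn_coefficients(gray, sigma=7/6, c=1.0):
    """Mean-subtracted contrast-normalized (MSCN) coefficients of a grayscale image."""
    mu = gaussian_filter(gray, sigma)
    var = gaussian_filter(gray * gray, sigma) - mu * mu
    std = np.sqrt(np.abs(var))
    return (gray - mu) / (std + c)

def nss_features(gray):
    """Low-order statistics of the MSCN map; adversarial noise perturbs these."""
    m = mscn_coefficients(gray.astype(np.float64))
    d = m - m.mean()
    return np.array([m.mean(), m.var(), (d**3).mean(), (d**4).mean()])

# With grayscale images (N, H, W) and labels (0 = clean, 1 = adversarial):
# detector = SVC(kernel="rbf").fit([nss_features(x) for x in images], labels)
# is_adversarial = detector.predict([nss_features(test_image)])
```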
{"title":"Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks","authors":"Anouar Kherchouche, Sid Ahmed Fezza, W. Hamidouche, O. Déforges","doi":"10.1109/MMSP48831.2020.9287056","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287056","url":null,"abstract":"The deep neural networks (DNNs) have been adopted in a wide spectrum of applications. However, it has been demonstrated that their are vulnerable to adversarial examples (AEs): carefully-crafted perturbations added to a clean input image. These AEs fool the DNNs which classify them incorrectly. Therefore, it is imperative to develop a detection method of AEs allowing the defense of DNNs. In this paper, we propose to characterize the adversarial perturbations through the use of natural scene statistics. We demonstrate that these statistical properties are altered by the presence of adversarial perturbations. Based on this finding, we design a classifier that exploits these scene statistics to determine if an input is adversarial or not. The proposed method has been evaluated against four prominent adversarial attacks and on three standards datasets. The experimental results have shown that the proposed detection method achieves a high detection accuracy, even against strong attacks, while providing a low false positive rate.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"259 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120939650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spectrogram-Based Classification Of Spoken Foul Language Using Deep CNN
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287133
A. Wazir, H. A. Karim, Mohd Haris Lye Abdullah, Sarina Mansor, Nouar Aldahoul, M. F. A. Fauzi, John See
Excessive profanity in audio and video content has been shown to shape one's character and behavior. Currently, censorship relies on conventional manual detection, which is time consuming and prone to missed detections of foul language. This paper proposes an intelligent model for foul language censorship through automated and robust detection by deep convolutional neural networks (CNNs). A dataset of foul language was collected and processed into audio spectrogram images that serve as input to the classifier. The proposed model was first tested on a 2-class (foul vs. normal) classification problem; the foul class was then further decomposed into a 10-class problem for exact identification of the profanity. Experimental results show the viability of the proposed system, demonstrating high performance in curse-word classification with an Error Rate (ER) of 1.24-2.71 for the 2-class problem and an F1-score of 5.49-8.30. The proposed ResNet50 architecture outperforms other models in terms of accuracy, sensitivity, specificity, and F1-score.
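A hedged sketch of such a spectrogram-to-CNN pipeline is shown below; the sampling rate, mel-band count, and use of torchvision's ResNet50 are assumptions for illustration rather than the authors' exact configuration.

```python
# A sketch under stated assumptions (16 kHz audio, 128 mel bands, torchvision
# ResNet50 with a 2-class head); not the authors' exact pipeline.
import librosa
import numpy as np
import torch
from torchvision.models import resnet50

def spectrogram_image(wav_path, sr=16000, n_mels=128):
    """Turn an audio clip into a 3-channel log-mel spectrogram tensor for the CNN."""
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    logmel = librosa.power_to_db(mel, ref=np.max)
    x = torch.tensor(logmel, dtype=torch.float32)
    return x.unsqueeze(0).repeat(3, 1, 1)  # replicate to 3 channels for ResNet input

model = resnet50(num_classes=2)  # foul vs. normal; train on the collected dataset
# logits = model(spectrogram_image("clip.wav").unsqueeze(0))
```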
{"title":"Spectrogram-Based Classification Of Spoken Foul Language Using Deep CNN","authors":"A. Wazir, H. A. Karim, Mohd Haris Lye Abdullah, Sarina Mansor, Nouar Aldahoul, M. F. A. Fauzi, John See","doi":"10.1109/MMSP48831.2020.9287133","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287133","url":null,"abstract":"Excessive content of profanity in audio and video files has proven to shape one’s character and behavior. Currently, conventional methods of manual detection and censorship are being used. Manual censorship method is time consuming and prone to misdetection of foul language. This paper proposed an intelligent model for foul language censorship through automated and robust detection by deep convolutional neural networks (CNNs). A dataset of foul language was collected and processed for the computation of audio spectrogram images that serve as an input to evaluate the classification of foul language. The proposed model was first tested for 2-class (Foul vs Normal) classification problem, the foul class is then further decomposed into a 10-class classification problem for exact detection of profanity. Experimental results show the viability of proposed system by demonstrating high performance of curse words classification with 1.24-2.71 Error Rate (ER) for 2-class and 5.49-8.30 F1- score. Proposed Resnet50 architecture outperforms other models in terms of accuracy, sensitivity, specificity, F1-score.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116347502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Low-Complexity Angular Intra-Prediction Convolutional Neural Network for Lossless HEVC
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287067
H. Huang, I. Schiopu, A. Munteanu
The paper proposes a novel low-complexity Convolutional Neural Network (CNN) architecture for block-wise angular intra-prediction in lossless video coding. The proposed CNN architecture is based on an efficient patch-processing layer structure. The CNN-based prediction method processes an input patch containing the causal neighborhood of the current block in order to directly generate the predicted block. The trained models are integrated into the HEVC video coding standard to perform CNN-based angular intra-prediction in competition with the conventional HEVC prediction. The proposed CNN architecture contains only 37% of the parameters of the state-of-the-art reference CNN architecture. Experimental results show that the inference runtime is also reduced by around 5.5% compared to the reference method, while the proposed coding systems yield 83% to 91% of its compression performance. The results demonstrate the potential of structural and complexity optimizations in CNN-based intra-prediction for lossless HEVC.
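The following is a minimal sketch of a patch-to-block CNN predictor of this kind; the layer widths, block size, and pooling head are assumptions, not the paper's architecture.

```python
# Minimal sketch of a causal-patch -> predicted-block CNN; sizes are assumptions.
import torch
import torch.nn as nn

class IntraPredCNN(nn.Module):
    """Maps a causal-neighborhood patch to a predicted block (here 8x8)."""
    def __init__(self, block=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(block), nn.Conv2d(16, 1, 1))

    def forward(self, patch):                    # patch: (N, 1, 2*block, 2*block)
        return self.head(self.features(patch))   # prediction: (N, 1, block, block)

# pred = IntraPredCNN()(torch.randn(1, 1, 16, 16))
# In the codec, the lossless residual (block - pred) is then entropy coded.
```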
{"title":"Low-Complexity Angular Intra-Prediction Convolutional Neural Network for Lossless HEVC","authors":"H. Huang, I. Schiopu, A. Munteanu","doi":"10.1109/MMSP48831.2020.9287067","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287067","url":null,"abstract":"The paper proposes a novel low-complexity Convolutional Neural Network (CNN) architecture for block-wise angular intra-prediction in lossless video coding. The proposed CNN architecture is designed based on an efficient patch processing layer structure. The proposed CNN-based prediction method is employed to process an input patch containing the causal neighborhood of the current block in order to directly generate the predicted block. The trained models are integrated in the HEVC video coding standard to perform CNN-based angular intra-prediction and to compete with the conventional HEVC prediction. The proposed CNN architecture contains a reduced number of parameters equivalent to only 37% of that of the state-of-the-art reference CNN architecture. Experimental results show that the inference runtime is also reduced by around 5.5% compared to that of the reference method. At the same time, the proposed coding systems yield 83% to 91% of the compression performance of the reference method. The results demonstrate the potential of structural and complexity optimizations in CNN-based intra-prediction for lossless HEVC.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115961842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Defining Embedding Distortion for Sample Adaptive Offset-Based HEVC Video Steganography
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287075
Yabing Cui, Yuanzhi Yao, Nenghai Yu
As a newly added in-loop filtering technique in High Efficiency Video Coding (HEVC), sample adaptive offset (SAO) can be utilized to embed messages for video steganography. This paper presents a novel SAO-based HEVC video steganographic scheme. The main principle is to design a distortion function that expresses the impact of embedding on the offsets, so that the embedding distortion can be minimized. Two factors, the sample rate-distortion cost fluctuation and the sample statistical characteristics, are considered in the distortion definition. Adaptive message embedding is implemented using syndrome-trellis codes (STC). Experimental results demonstrate the merits of the proposed scheme in terms of undetectability and video coding performance.
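A toy sketch of such an additive cost assignment is given below; the weighting of the two factors and the function names are assumptions, and the STC embedding stage is only indicated in a comment.

```python
# Illustrative sketch only: an additive cost over SAO offsets combining the two
# factors named above; alpha/beta weights and names are assumptions.
import numpy as np

def embedding_cost(rd_cost_orig, rd_cost_mod, sample_counts, alpha=1.0, beta=1.0):
    """Cost of modifying each SAO offset: a large rate-distortion cost
    fluctuation, or a statistically unusual offset, makes a change expensive."""
    fluctuation = np.abs(rd_cost_mod - rd_cost_orig)  # per-offset RD-cost fluctuation
    rarity = 1.0 / (sample_counts + 1.0)              # few covered samples -> conspicuous
    return alpha * fluctuation + beta * rarity

# These costs feed a syndrome-trellis coder (STC), which selects the set of
# offset modifications of minimal total cost that embeds the message bits.
```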
{"title":"Defining Embedding Distortion for Sample Adaptive Offset-Based HEVC Video Steganography","authors":"Yabing Cui, Yuanzhi Yao, Nenghai Yu","doi":"10.1109/MMSP48831.2020.9287075","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287075","url":null,"abstract":"As a newly added in-loop filtering technique in High Efficiency Video Coding (HEVC), sample adaptive offset (SAO) can be utilized to embed messages for video steganography. This paper presents a novel SAO-based HEVC video steganographic scheme. The main principle is to design a suitable distortion function which expresses the embedding impacts on offsets based on minimizing embedding distortion. Two factors including the sample rate-distortion cost fluctuation and the sample statistical characteristic are considered in embedding distortion definition. Adaptive message embedding is implemented using syndrome-trellis codes (STC). Experimental results demonstrate the merits of the proposed scheme in terms of undetectability and video coding performance.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124021560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Blind reverberation time estimation from ambisonic recordings
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287128
A. Pérez-López, A. Politis, E. Gómez
Reverberation time is an important room acoustic parameter, useful for many acoustic signal processing applications. Most existing work on blind reverberation time estimation focuses on the single-channel case. However, recent developments and interest in immersive audio have brought a number of spherical microphone arrays to the market, together with the adoption of ambisonics as a standard spatial audio convention. This work presents a novel blind reverberation time estimation method that specifically targets ambisonic recordings, a setting that, to the best of our knowledge, has remained unexplored. Experimental validation on a synthetic reverberant dataset shows that the proposed algorithm outperforms state-of-the-art methods under most evaluation criteria in low-noise conditions.
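For context, the quantity being estimated blindly is the classical RT60. The sketch below computes it non-blindly from a known impulse response via Schroeder backward integration with a T30 line fit; it illustrates the target measure, not the proposed blind estimator.

```python
# Not the blind estimator itself: the classical RT60 such methods target,
# computed from a known impulse response (Schroeder integration, T30 fit).
import numpy as np

def rt60_schroeder(ir, fs):
    edc = np.cumsum(ir[::-1] ** 2)[::-1]            # backward-integrated energy decay
    edc_db = 10 * np.log10(edc / edc[0] + 1e-12)    # normalized decay curve in dB
    t = np.arange(len(ir)) / fs
    seg = (edc_db <= -5) & (edc_db >= -35)          # fit the -5 dB .. -35 dB segment
    slope, _ = np.polyfit(t[seg], edc_db[seg], 1)   # decay rate in dB per second
    return -60.0 / slope                            # extrapolate to -60 dB
```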
{"title":"Blind reverberation time estimation from ambisonic recordings","authors":"A. Pérez-López, A. Politis, E. Gómez","doi":"10.1109/MMSP48831.2020.9287128","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287128","url":null,"abstract":"Reverberation time is an important room acoustic parameter, useful for many acoustic signal processing applications. Most of the existing work on blind reverberation time estimation focuses on the single-channel case. However, the recent developments and interest on immersive audio have brought to the market a number of spherical microphone arrays, together with the usage of ambisonics as a standard spatial audio convention. This work presents a novel blind reverberation time estimation method, which specifically targets ambisonic recordings, a field that remained unexplored to the best of our knowledge. Experimental validation on a synthetic reverberant dataset shows that the proposed algorithm outperforms state-of-the-art methods under most evaluation criteria in low noise conditions.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126785154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generalized Operational Classifiers for Material Identification
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287058
Xiaoyue Jiang, Ding Wang, D. Tran, S. Kiranyaz, M. Gabbouj, Xiaoyi Feng
Material is an intrinsic property of objects, and material recognition consequently plays an important role in image understanding. The same material may take on various shapes and appearances while keeping the same physical characteristics, which makes material recognition challenging. Besides suitable features, a powerful classifier can also improve the overall recognition performance. Due to the limitations of the classical linear neurons used in shallow and deep neural networks such as CNNs, we propose to apply generalized operational neurons to construct a classifier adaptively. These generalized operational perceptrons (GOPs) contain a set of linear and nonlinear neurons and possess a structure that can be built progressively, which makes the GOP classifier more compact and better able to discriminate complex classes. The experiments demonstrate that GOP networks trained on a small portion of the data (4%) can achieve performance comparable to state-of-the-art models trained on much larger portions of the dataset.
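A minimal sketch of a single GOP-style neuron follows; the operator sets (multiplication, sine, and exponential nodal operators; sum/max pooling) are representative assumptions rather than the exact sets used in the paper.

```python
# Minimal GOP-neuron sketch: output = activation(pool(nodal_op(x, w))).
# Operator sets here are illustrative assumptions.
import torch
import torch.nn as nn

class GOPNeuron(nn.Module):
    def __init__(self, in_features, nodal="mul", pool="sum", act="tanh"):
        super().__init__()
        self.w = nn.Parameter(torch.randn(in_features) * 0.1)
        self.nodal = {"mul": lambda x, w: x * w,
                      "sin": lambda x, w: torch.sin(x * w),
                      "exp": lambda x, w: torch.exp(x * w) - 1}[nodal]
        self.pool = {"sum": lambda z: z.sum(-1),
                     "max": lambda z: z.max(-1).values}[pool]
        self.act = {"tanh": torch.tanh, "relu": torch.relu}[act]

    def forward(self, x):                          # x: (N, in_features)
        return self.act(self.pool(self.nodal(x, self.w)))

# A GOP layer stacks such neurons; progressive training selects the operator
# triple per layer that best separates the material classes.
```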
{"title":"Generalized Operational Classifiers for Material Identification","authors":"Xiaoyue Jiang, Ding Wang, D. Tran, S. Kiranyaz, M. Gabbouj, Xiaoyi Feng","doi":"10.1109/MMSP48831.2020.9287058","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287058","url":null,"abstract":"Material is one of the intrinsic features of objects, and consequently material recognition plays an important role in image understanding. The same material may have various shapes and appearance, while keeping the same physical characteristic. This brings great challenges for material recognition. Besides suitable features, a powerful classifier also can improve the overall recognition performance. Due to the limitations of classical linear neurons, used in all shallow and deep neural networks, such as CNN, we propose to apply the generalized operational neurons to construct a classifier adaptively. These generalized operational perceptrons (GOP) contain a set of linear and nonlinear neurons, and possess a structure that can be built progressively. This makes GOP classifier more compact and can easily discriminate complex classes. The experiments demonstrate that GOP networks trained on a small portion of the data (4%) can achieve comparable performances to state-of-the-arts models trained on much larger portions of the dataset.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133798696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MMSP 2020 Index
Pub Date: 2020-09-21 | DOI: 10.1109/mmsp48831.2020.9287137
{"title":"MMSP 2020 Index","authors":"","doi":"10.1109/mmsp48831.2020.9287137","DOIUrl":"https://doi.org/10.1109/mmsp48831.2020.9287137","url":null,"abstract":"","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132595380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Study on viewing completion ratio of video streaming
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287091
Pierre R. Lebreton, Kazuhisa Yamagishi
In this paper, a model is investigated for optimizing the encoding of adaptive bitrate video streaming. To this end, the relationship between quality, content duration, and acceptability, measured using the completion ratio, is studied. This work is based on extensive subjective testing performed in a laboratory environment and shows the importance of stimulus duration in acceptance studies. A model to predict the completion ratio of videos is provided and shows good accuracy. Using this model, quality requirements can be derived from the target abandonment rate and content duration. This work will help video streaming providers define suitable coding conditions that maintain user engagement when preparing content for broadcast on their platforms.
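As a hedged illustration of how such a model can be used, the sketch below fits a logistic model of completion on quality (e.g., MOS) and duration; the functional form and toy data are assumptions, not the paper's model.

```python
# Hypothetical illustration: logistic model of completion vs. (quality, duration).
# Toy data and functional form are assumptions; the paper's model may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

# hypothetical observations: [MOS, duration in minutes] -> completed (1) / abandoned (0)
X = np.array([[4.5, 2], [4.2, 10], [3.0, 10], [2.5, 30], [4.8, 30], [2.0, 2]])
y = np.array([1, 1, 0, 0, 1, 0])

model = LogisticRegression().fit(X, y)
p_complete = model.predict_proba([[3.5, 15]])[0, 1]  # predicted completion ratio
# Inverting the model at a target abandonment rate and duration yields the
# minimum quality (bitrate) requirement for encoding.
```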
{"title":"Study on viewing completion ratio of video streaming","authors":"Pierre R. Lebreton, Kazuhisa Yamagishi","doi":"10.1109/MMSP48831.2020.9287091","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287091","url":null,"abstract":"In this paper, a model is investigated for optimizing the encoding of adaptive bitrate video streaming. To this end, the relationship between quality, content duration, and acceptability measured by using the completion ratio is studied. This work is based on intensive subjective testing performed in a laboratory environment and shows the importance of stimulus duration in acceptance studies. A model to predict the completion ratio of videos is provided and shows good accuracy. By using this model, quality requirements can be derived on the basis of the target abandonment rate and content duration. This work will help video streaming providers to define suitable coding conditions when preparing content to be broadcast on their platform that will maintain user engagement.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115099430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Genre Classification for Gaming Videos
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287122
Steve Goering, Robert Steger, Rakesh Rao Ramachandra Rao, A. Raake
Besides classical videos, videos of gaming matches, entire tournaments, and individual sessions are streamed and viewed all over the world. The increased popularity of Twitch and YouTube Gaming shows the importance of additional research on gaming videos. One important precondition for live or offline encoding of gaming videos is knowledge of game-specific properties. Knowing, or automatically predicting, the genre of a gaming video enables a more advanced and optimized encoding pipeline for streaming providers, especially because gaming videos of different genres differ considerably from classical 2D video, e.g., in CGI content, textures, and camera motion. We describe several computer-vision-based features, optimized for speed and motivated by characteristics of popular games, to automatically predict the genre of a gaming video. Our prediction system uses random forests and gradient boosting trees as the underlying machine-learning techniques, combined with feature selection. For the evaluation of our approach, we use a dataset built as part of this work, consisting of recorded gaming sessions for 6 genres from Twitch, with 351 different videos in total. We show that our prediction approach achieves good performance in terms of F1-score. In addition to evaluating different machine-learning approaches, we investigate the influence of the algorithms' hyper-parameters.
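A sketch of the classification stage is shown below, using scikit-learn's random forest and gradient boosting with univariate feature selection; the feature vectors are placeholders, since the paper's hand-crafted computer-vision features are not reproduced here.

```python
# Sketch of the classification stage under assumed placeholder features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X = np.random.rand(351, 40)          # placeholder: one feature vector per video
y = np.random.randint(0, 6, 351)     # placeholder: 6 genre labels

for clf in (RandomForestClassifier(n_estimators=200),
            GradientBoostingClassifier()):
    pipe = make_pipeline(SelectKBest(f_classif, k=20), clf)
    scores = cross_val_score(pipe, X, y, scoring="f1_macro", cv=5)
    print(type(clf).__name__, scores.mean())
```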
{"title":"Automated Genre Classification for Gaming Videos","authors":"Steve Goering, Robert Steger, Rakesh Rao Ramachandra Rao, A. Raake","doi":"10.1109/MMSP48831.2020.9287122","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287122","url":null,"abstract":"Besides classical videos, videos of gaming matches, entire tournaments or individual sessions are streamed and viewed all over the world. The increased popularity of Twitch or YoutubeGaming shows the importance of additional research on gaming videos. One important pre-condition for live or offline encoding of gaming videos is the knowledge of game-specific properties. Knowing or automatically predicting the genre of a gaming video enables a more advanced and optimized encoding pipeline for streaming providers, especially because gaming videos of different genres vary a lot from classical 2D video, e.g., considering the CGI content, textures or camera motion. We describe several computer-vision based features that are optimized for speed and motivated by characteristics of popular games, to automatically predict the genre of a gaming video. Our prediction system uses random forest and gradient boosting trees as underlying machine-learning techniques, combined with feature selection. For the evaluation of our approach we use a dataset that was built as part of this work and consists of recorded gaming sessions for 6 genres from Twitch. In total 351 different videos are considered. We show that our prediction approach shows a good performance in terms of f1-score. Besides the evaluation of different machine-learning approaches, we additionally investigate the influence of the hyper-parameters for the algorithms.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115686579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Object-Oriented Motion Estimation using Edge-Based Image Registration
Pub Date: 2020-09-21 | DOI: 10.1109/MMSP48831.2020.9287129
Md. Asikuzzaman, Deepak Rajamohan, M. Pickering
Video data storage and transmission costs can be reduced by minimizing the temporally redundant information among frames using an appropriate motion-compensated prediction technique. In the current video coding standard, neighbouring frames are exploited to predict the motion of the current frame using global motion estimation-based approaches. However, global motion estimation may not capture the actual motion of individual objects, as each object in a frame usually has its own motion. In this paper, an edge-based motion estimation technique is presented that finds the motion of each object in the frame rather than the global motion of the frame. In the proposed method, edge position difference (EPD) similarity measure-based image registration between the two frames is applied to register each object in the frame. A superpixel search is then applied to segment the registered object. Finally, the proposed edge-based image registration technique and the Demons algorithm are applied to predict the objects in the current frame. Our experimental analysis demonstrates that the proposed algorithm estimates the motion of individual objects in the current frame more accurately than existing global motion estimation-based approaches.
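The following sketch shows one plausible form of an EPD-style similarity measure built from Canny edges and a distance transform; the exact measure in the paper may differ, and the thresholds are assumptions.

```python
# Illustrative EPD-style similarity: mean distance from one frame's edge pixels
# to the nearest edge in the other frame. Thresholds and form are assumptions.
import cv2
import numpy as np

def epd_similarity(frame_a, frame_b, t1=100, t2=200):
    """frame_a, frame_b: uint8 grayscale frames; lower score = better alignment."""
    edges_a = cv2.Canny(frame_a, t1, t2)
    edges_b = cv2.Canny(frame_b, t1, t2)
    # distance transform of the inverted edge map gives distance-to-nearest-edge
    dist_b = cv2.distanceTransform(255 - edges_b, cv2.DIST_L2, 3)
    ys, xs = np.nonzero(edges_a)
    return dist_b[ys, xs].mean()

# Registration searches each object's motion parameters to minimize this
# measure; the registered object is then segmented via a superpixel search.
```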
{"title":"Object-Oriented Motion Estimation using Edge-Based Image Registration","authors":"Md. Asikuzzaman, Deepak Rajamohan, M. Pickering","doi":"10.1109/MMSP48831.2020.9287129","DOIUrl":"https://doi.org/10.1109/MMSP48831.2020.9287129","url":null,"abstract":"Video data storage and transmission cost can be reduced by minimizing the temporally redundant information among frames using an appropriate motion-compensated prediction technique. In the current video coding standard, the neighbouring frames are exploited to predict the motion of the current frame using global motion estimation-based approaches. However, the global motion estimation of a frame may not produce the actual motion of individual objects in the frame as each of the objects in a frame usually has its own motion. In this paper, an edge-based motion estimation technique is presented that finds the motion of each object in the frame rather than finding the global motion of that frame. In the proposed method, edge position difference (EPD) similarity measure-based image registration between the two frames is applied to register each object in the frame. A superpixel search is then applied to segment the registered object. Finally, the proposed edge-based image registration technique and Demons algorithm are applied to predict the objects in the current frame. Our experimental analysis demonstrates that the proposed algorithm can estimate the motions of individual objects in the current frame accurately compared to the existing global motion estimation-based approaches.","PeriodicalId":188283,"journal":{"name":"2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123177343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}