Latest publications from the 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)
Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287056
Anouar Kherchouche, Sid Ahmed Fezza, W. Hamidouche, O. Déforges
Deep neural networks (DNNs) have been adopted in a wide spectrum of applications. However, it has been demonstrated that they are vulnerable to adversarial examples (AEs): carefully crafted perturbations added to a clean input image. These AEs fool DNNs into classifying them incorrectly. Therefore, it is imperative to develop a detection method for AEs that allows the defense of DNNs. In this paper, we propose to characterize adversarial perturbations through the use of natural scene statistics. We demonstrate that these statistical properties are altered by the presence of adversarial perturbations. Based on this finding, we design a classifier that exploits these scene statistics to determine whether an input is adversarial. The proposed method has been evaluated against four prominent adversarial attacks and on three standard datasets. The experimental results show that the proposed detection method achieves high detection accuracy, even against strong attacks, while providing a low false positive rate.
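To make the idea concrete, here is a minimal sketch (not the authors' code) of one common natural-scene-statistics transform, the mean-subtracted contrast-normalized (MSCN) coefficients, plus a few summary statistics a detector could consume; all function names and parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(image, sigma=7/6, c=1.0):
    """Mean-subtracted contrast-normalized (MSCN) map: subtract a local
    Gaussian-weighted mean and divide by the local standard deviation."""
    img = image.astype(np.float64)
    mu = gaussian_filter(img, sigma)
    var = gaussian_filter(img * img, sigma) - mu * mu
    return (img - mu) / (np.sqrt(np.abs(var)) + c)

def nss_features(image):
    """Summary statistics of the MSCN map (mean, variance, skewness,
    kurtosis) that a binary adversarial/clean classifier could use."""
    m = mscn_coefficients(image)
    mu, sd = m.mean(), m.std()
    skew = ((m - mu) ** 3).mean() / (sd ** 3 + 1e-12)
    kurt = ((m - mu) ** 4).mean() / (sd ** 4 + 1e-12)
    return np.array([mu, m.var(), skew, kurt])
```

An adversarial perturbation shifts these statistics away from their natural-image regime; the paper's classifier is trained on exactly such feature vectors.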
Citations: 7
Spectrogram-Based Classification Of Spoken Foul Language Using Deep CNN
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287133
A. Wazir, H. A. Karim, Mohd Haris Lye Abdullah, Sarina Mansor, Nouar Aldahoul, M. F. A. Fauzi, John See
Excessive profanity in audio and video files has been shown to shape one's character and behavior. Currently, conventional methods of manual detection and censorship are used. Manual censorship is time-consuming and prone to misdetection of foul language. This paper proposes an intelligent model for foul language censorship through automated and robust detection by deep convolutional neural networks (CNNs). A dataset of foul language was collected and processed to compute audio spectrogram images that serve as the input for foul language classification. The proposed model was first tested on a 2-class (foul vs. normal) classification problem; the foul class was then further decomposed into a 10-class problem for exact detection of profanity. Experimental results show the viability of the proposed system, demonstrating high performance in curse-word classification with a 1.24-2.71 Error Rate (ER) for the 2-class case and 5.49-8.30 F1-score. The proposed ResNet50 architecture outperforms other models in terms of accuracy, sensitivity, specificity, and F1-score.
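The spectrogram-image front end described above can be sketched in a few lines: a short-time FFT with a Hann window, log-compressed into a 2-D array that a CNN would consume. This is a generic sketch, not the paper's exact pipeline; frame sizes and the log compression are assumptions.

```python
import numpy as np

def spectrogram(signal, n_fft=512, hop=256):
    """Magnitude spectrogram via short-time FFT with a Hann window.
    Returns an array of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T

def log_spectrogram_image(signal):
    """Log-compressed spectrogram: the 2-D 'image' a CNN classifier
    (e.g., a ResNet50 backbone) would take as input."""
    return np.log1p(spectrogram(signal))
```

A pure 1 kHz tone at a 16 kHz sample rate, for instance, concentrates its energy in frequency bin 1000 / (16000 / 512) = 32.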
Citations: 7
Low-Complexity Angular Intra-Prediction Convolutional Neural Network for Lossless HEVC
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287067
H. Huang, I. Schiopu, A. Munteanu
The paper proposes a novel low-complexity Convolutional Neural Network (CNN) architecture for block-wise angular intra-prediction in lossless video coding. The proposed CNN architecture is designed based on an efficient patch processing layer structure. The proposed CNN-based prediction method is employed to process an input patch containing the causal neighborhood of the current block in order to directly generate the predicted block. The trained models are integrated in the HEVC video coding standard to perform CNN-based angular intra-prediction and to compete with the conventional HEVC prediction. The proposed CNN architecture contains a reduced number of parameters equivalent to only 37% of that of the state-of-the-art reference CNN architecture. Experimental results show that the inference runtime is also reduced by around 5.5% compared to that of the reference method. At the same time, the proposed coding systems yield 83% to 91% of the compression performance of the reference method. The results demonstrate the potential of structural and complexity optimizations in CNN-based intra-prediction for lossless HEVC.
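The block-wise scheme above can be sketched as follows. The predictor here is a trivial causal-context mean, standing in for the paper's low-complexity CNN; the point of the sketch is the lossless structure, where integer residuals against the prediction make reconstruction exact. Block and context sizes are assumptions.

```python
import numpy as np

def causal_patch(recon, y, x, b, ctx):
    """Slice the causal neighborhood (top/left context plus the current
    block position) out of the reconstruction buffer."""
    return recon[max(0, y - ctx):y + b, max(0, x - ctx):x + b]

def predict_block(patch, b):
    """Stand-in predictor: mean of the causal context. The paper replaces
    this with a CNN applied to the same input patch."""
    masked = patch.astype(np.float64).copy()
    masked[-b:, -b:] = np.nan              # hide the block being predicted
    vals = masked[np.isfinite(masked)]
    mean = vals.mean() if vals.size else 128.0
    return np.full((b, b), int(round(mean)), dtype=np.int64)

def lossless_code(img, b=4, ctx=4):
    """Block-wise prediction + integer residuals: reconstruction is exact,
    the defining property of lossless intra coding."""
    residuals = np.zeros_like(img)
    recon = np.zeros_like(img)
    for y in range(0, img.shape[0], b):
        for x in range(0, img.shape[1], b):
            pred = predict_block(causal_patch(recon, y, x, b, ctx), b)
            residuals[y:y + b, x:x + b] = img[y:y + b, x:x + b] - pred
            recon[y:y + b, x:x + b] = pred + residuals[y:y + b, x:x + b]
    return residuals, recon
```

Note that prediction reads from the reconstruction buffer, not the source image, so the decoder can reproduce the same predictions from decoded residuals alone.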
Citations: 1
Defining Embedding Distortion for Sample Adaptive Offset-Based HEVC Video Steganography
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287075
Yabing Cui, Yuanzhi Yao, Nenghai Yu
As a newly added in-loop filtering technique in High Efficiency Video Coding (HEVC), sample adaptive offset (SAO) can be utilized to embed messages for video steganography. This paper presents a novel SAO-based HEVC video steganographic scheme. The main principle is to design a suitable distortion function that expresses the embedding impact on offsets, based on minimizing embedding distortion. Two factors, the sample rate-distortion cost fluctuation and the sample statistical characteristics, are considered in the embedding distortion definition. Adaptive message embedding is implemented using syndrome-trellis codes (STC). Experimental results demonstrate the merits of the proposed scheme in terms of undetectability and video coding performance.
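As an illustration of the cost-based embedding idea, the sketch below defines an additive per-offset distortion combining an RD-cost fluctuation term with a sample-statistics term, then embeds bits by ±1 LSB matching. Both the cost formula and the embedding are simplified stand-ins: the paper's exact distortion function differs, and it uses STC rather than direct LSB flipping to minimize total distortion.

```python
import numpy as np

def embedding_cost(rd_orig, rd_mod, sample_var, alpha=1.0, beta=1.0):
    """Illustrative per-offset distortion: RD-cost fluctuation plus a
    sample-statistics term (the paper's exact weighting may differ)."""
    return alpha * np.abs(rd_mod - rd_orig) + beta / (1.0 + sample_var)

def lsb_embed(offsets, costs, bits):
    """Toy +/-1 LSB embedding; an STC would minimize the same kind of
    additive cost over all positions jointly."""
    stego = offsets.copy()
    total = 0.0
    for i, b in enumerate(bits):
        if (int(stego[i]) & 1) != b:
            stego[i] += 1 if stego[i] < 7 else -1  # SAO offsets are bounded
            total += costs[i]
    return stego, total

def lsb_extract(stego, n):
    """Receiver reads the message back from the offset LSBs."""
    return [int(v) & 1 for v in stego[:n]]
```

The key property carried over from the paper is that each modification incurs a precomputed cost, so the embedder can prefer offsets whose change disturbs the cover least.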
Citations: 1
Blind reverberation time estimation from ambisonic recordings
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287128
A. Pérez-López, A. Politis, E. Gómez
Reverberation time is an important room acoustic parameter, useful for many acoustic signal processing applications. Most of the existing work on blind reverberation time estimation focuses on the single-channel case. However, recent developments and interest in immersive audio have brought to the market a number of spherical microphone arrays, together with the adoption of ambisonics as a standard spatial audio convention. This work presents a novel blind reverberation time estimation method that specifically targets ambisonic recordings, a field that, to the best of our knowledge, remained unexplored. Experimental validation on a synthetic reverberant dataset shows that the proposed algorithm outperforms state-of-the-art methods under most evaluation criteria in low noise conditions.
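For readers unfamiliar with the quantity being estimated: RT60 is the time for the sound level to decay by 60 dB. A crude single-channel sketch, assuming a free-decay segment, fits a line to the log-energy envelope and converts the slope (dB/s) into RT60; the paper's actual method is considerably more sophisticated and operates on ambisonic channels.

```python
import numpy as np

def estimate_rt60(signal, fs, frame=1024):
    """Fit a line to the frame log-energy envelope of a decaying signal
    and convert the decay slope (dB/s) into RT60 = -60 / slope."""
    n = len(signal) // frame
    energy = np.array([np.mean(signal[i * frame:(i + 1) * frame] ** 2)
                       for i in range(n)])
    log_e = 10.0 * np.log10(energy + 1e-12)          # level in dB
    t = (np.arange(n) + 0.5) * frame / fs            # frame centers in s
    slope, _ = np.polyfit(t, log_e, 1)
    return -60.0 / slope
```

On synthetic exponentially decaying noise with a known RT60, the estimate lands close to the ground truth, which is exactly the kind of synthetic validation the abstract describes.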
Citations: 1
Generalized Operational Classifiers for Material Identification
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287058
Xiaoyue Jiang, Ding Wang, D. Tran, S. Kiranyaz, M. Gabbouj, Xiaoyi Feng
Material is one of the intrinsic features of objects, and consequently material recognition plays an important role in image understanding. The same material may have various shapes and appearances while keeping the same physical characteristics. This brings great challenges for material recognition. Besides suitable features, a powerful classifier can also improve the overall recognition performance. Due to the limitations of the classical linear neurons used in all shallow and deep neural networks, such as CNNs, we propose to apply generalized operational neurons to construct a classifier adaptively. These generalized operational perceptrons (GOP) contain a set of linear and nonlinear neurons and possess a structure that can be built progressively. This makes the GOP classifier more compact and able to easily discriminate complex classes. The experiments demonstrate that GOP networks trained on a small portion of the data (4%) can achieve performance comparable to state-of-the-art models trained on much larger portions of the dataset.
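The defining feature of a GOP neuron is that the multiply, sum, and activation of a classical neuron become selectable operators. A minimal sketch, with a hypothetical operator library chosen for illustration:

```python
import numpy as np

# Candidate operator libraries; the actual GOP operator sets vary.
NODAL = {"mul":  lambda x, w: x * w,
         "exp":  lambda x, w: np.exp(x * w) - 1.0,
         "sine": lambda x, w: np.sin(x * w)}
POOL = {"sum": np.sum, "max": np.max, "median": np.median}
ACT = {"tanh": np.tanh,
       "relu": lambda z: np.maximum(z, 0.0),
       "linear": lambda z: z}

def gop_neuron(x, w, b, nodal="mul", pool="sum", act="tanh"):
    """One generalized operational perceptron neuron:
    act(pool(nodal(x, w)) + b). With mul/sum/linear it reduces
    exactly to a classical linear neuron."""
    return ACT[act](POOL[pool](NODAL[nodal](x, w)) + b)
```

Training then searches over operator choices per neuron as well as weights, which is what lets a progressively built GOP network stay compact.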
Citations: 3
MMSP 2020 Index
Pub Date : 2020-09-21 DOI: 10.1109/mmsp48831.2020.9287137
Citations: 0
Learned BRIEF – transferring the knowledge from hand-crafted to learning-based descriptors
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287159
Nina Žižakić, A. Pižurica
In this paper, we present a novel approach for designing local image descriptors that learn from data and from hand-crafted descriptors. In particular, we construct a learning model that first mimics the behaviour of a hand-crafted descriptor and then learns to improve upon it in an unsupervised manner. We demonstrate the use of this knowledge-transfer framework by constructing the learned BRIEF descriptor based on the well-known hand-crafted descriptor BRIEF. We implement our learned BRIEF with a convolutional autoencoder architecture. Evaluation on the HPatches benchmark for local image descriptors shows the effectiveness of the proposed approach in the tasks of patch retrieval, patch verification, and image matching.
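The teacher in this knowledge-transfer setup is the classic BRIEF descriptor, which a minimal sketch captures: a fixed set of random pixel-pair intensity tests inside a patch, yielding a binary string compared with Hamming distance. The sampling pattern below is an assumption (uniform pairs); the learned model then mimics and refines such outputs.

```python
import numpy as np

def brief_descriptor(patch, n_bits=128, seed=0):
    """Hand-crafted BRIEF: each bit is 1 if the intensity at one random
    location is less than at its paired location. The seed fixes the
    test pattern, so all patches are described consistently."""
    rng = np.random.RandomState(seed)
    flat = patch.ravel()
    p1 = rng.randint(0, flat.size, n_bits)
    p2 = rng.randint(0, flat.size, n_bits)
    return (flat[p1] < flat[p2]).astype(np.uint8)

def hamming(d1, d2):
    """Matching metric for binary descriptors: count of differing bits."""
    return int(np.sum(d1 != d2))
```

An autoencoder trained to reproduce these bit vectors, then fine-tuned without supervision, is the paper's route from hand-crafted to learned.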
Citations: 1
Automated Genre Classification for Gaming Videos
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287122
Steve Goering, Robert Steger, Rakesh Rao Ramachandra Rao, A. Raake
Besides classical videos, videos of gaming matches, entire tournaments or individual sessions are streamed and viewed all over the world. The increased popularity of Twitch or YoutubeGaming shows the importance of additional research on gaming videos. One important pre-condition for live or offline encoding of gaming videos is the knowledge of game-specific properties. Knowing or automatically predicting the genre of a gaming video enables a more advanced and optimized encoding pipeline for streaming providers, especially because gaming videos of different genres differ greatly from classical 2D video, e.g., considering the CGI content, textures or camera motion. We describe several computer-vision based features that are optimized for speed and motivated by characteristics of popular games, to automatically predict the genre of a gaming video. Our prediction system uses random forest and gradient boosting trees as underlying machine-learning techniques, combined with feature selection. For the evaluation of our approach we use a dataset that was built as part of this work and consists of recorded gaming sessions for 6 genres from Twitch. In total 351 different videos are considered. We show that our prediction approach achieves good performance in terms of F1-score.
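The feature-selection-plus-random-forest pipeline can be sketched with scikit-learn (assuming it is available; the feature extraction itself is the paper's contribution and is replaced here by synthetic vectors). `genre_classifier` and its parameters are illustrative, not the authors' configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline

def genre_classifier(n_features_kept=5):
    """Univariate feature selection followed by a random forest,
    mirroring the described pipeline; a gradient boosting classifier
    would slot into the same place."""
    return make_pipeline(
        SelectKBest(f_classif, k=n_features_kept),
        RandomForestClassifier(n_estimators=100, random_state=0),
    )
```

In the paper, the input vectors would be the speed-optimized computer-vision features per video, and the labels the six Twitch genres.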
Citations: 4
Bi-directional intra prediction based measurement coding for compressive sensing images
Pub Date : 2020-09-21 DOI: 10.1109/MMSP48831.2020.9287074
Thuy Thi Thu Tran, Jirayu Peetakul, Chi Do-Kim Pham, Jinjia Zhou
This work proposes a bi-directional intra prediction-based measurement coding algorithm for compressive sensing images. Compressive sensing is capable of reducing the size of sparse signals, in which the high-dimensional signals are represented by under-determined linear measurements. In order to exploit the spatial redundancy in measurements, the corresponding pixel-domain information is extracted using the structure of the measurement matrix. First, the mono-directional prediction modes (i.e., horizontal mode and vertical mode), which use the nearest information from neighboring pixel blocks, are obtained through the structure of the measurement matrix. Second, we design bi-directional intra prediction modes (i.e., Diagonal + Horizontal, Diagonal + Vertical) based on the already obtained mono-directional prediction modes. Experimental results show that this work achieves a 0.01 - 0.02 dB PSNR improvement and bitrate reductions of 19% on average, up to 36%, compared to the state-of-the-art.
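The mode-selection idea can be illustrated in the pixel domain: extend the left neighbor's last column (horizontal), the top neighbor's last row (vertical), or combine directions, then keep the mode with the smallest residual energy. This is a simplified sketch with an averaged H+V mode standing in for the paper's Diagonal + Horizontal / Diagonal + Vertical combinations, and it omits the measurement-matrix machinery entirely.

```python
import numpy as np

def predict(left, top, mode):
    """Mono-directional (H, V) and a combined bi-directional (HV)
    predictor for a square block, from its left/top neighbor blocks."""
    b = left.shape[0]
    h = np.tile(left[:, -1:], (1, b))   # horizontal: extend last column
    v = np.tile(top[-1:, :], (b, 1))    # vertical: extend last row
    if mode == "H":
        return h
    if mode == "V":
        return v
    return (h + v) / 2.0                # simple bi-directional combination

def best_mode(block, left, top):
    """Pick the prediction mode minimizing residual energy."""
    return min(("H", "V", "HV"),
               key=lambda m: np.sum((block - predict(left, top, m)) ** 2))
```

In the paper, the same competition happens on measurement vectors, with the measurement-matrix structure supplying the pixel-domain neighbors.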
Citations: 4