
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP): Latest Papers

Generalized Dirichlet mixture matching projection for supervised linear dimensionality reduction of proportional data
Pub Date : 2017-03-05 DOI: 10.1109/ICASSP.2017.7952668
Walid Masoudimansour, N. Bouguila
This paper introduces a novel and effective method for reducing the dimensionality of labeled proportional data. Most well-known linear dimensionality reduction methods rely on solving the generalized eigenvalue problem, which fails in certain cases such as sparse data. The proposed algorithm is a linear method that takes a different approach to dimensionality reduction, avoiding this failure while yielding higher classification rates. Data is assumed to come from two classes, each of which is matched to a mixture of generalized Dirichlet distributions after projection. The Jeffrey divergence is then used as a dissimilarity measure between the projected classes to increase the inter-class variance. A genetic algorithm is used to find the optimal projection that yields the largest mutual information. The method is designed specifically as a preprocessing step for binary classification; however, because it uses mixture models, it can handle multi-modal data effectively and can therefore be used for multi-class problems as well.
Citations: 5
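As a rough illustration of the dissimilarity measure named in the abstract by Masoudimansour and Bouguila, the sketch below computes the Jeffrey divergence (the symmetrised Kullback-Leibler divergence) between two discrete distributions. It is a simplification: the paper applies the measure to classes modelled as mixtures of generalized Dirichlet distributions after projection, whereas this sketch assumes plain histograms. The function name `jeffrey_divergence`, the clamping constant `eps`, and the sample vectors are hypothetical.

```python
import math


def jeffrey_divergence(p, q, eps=1e-12):
    """Symmetrised KL divergence: sum over i of (p_i - q_i) * log(p_i / q_i)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        # Clamp to avoid log(0); eps is an illustrative choice, not from the paper.
        pi = max(pi, eps)
        qi = max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


# Two hypothetical projected class histograms (each sums to 1).
class_a = [0.7, 0.2, 0.1]
class_b = [0.1, 0.2, 0.7]

print(jeffrey_divergence(class_a, class_a))  # identical distributions give 0.0
print(jeffrey_divergence(class_a, class_b))  # larger value means more dissimilar
```

The measure is symmetric in its two arguments, which is one reason it is convenient as a class-separation score: it does not matter which class is treated as the reference.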
Mobile live streaming: Insights from the Periscope service
Pub Date : 2016-09-23 DOI: 10.1109/MMSP.2016.7813395
Leonardo Favario, M. Siekkinen, E. Masala
Live video streaming from mobile devices is quickly becoming popular through services such as Periscope, Meerkat, and Facebook Live. Little is known, however, about how such services tackle the challenges of the live mobile streaming scenario. This work addresses this gap by investigating the characteristics of the Periscope service in detail. A large number of publicly available streams have been captured and analyzed in depth, with particular attention to the characteristics of the encoded streams and the evolution of the communication over time. This investigation provides insight into key performance parameters such as bandwidth, latency, buffer levels, and freezes, as well as the limits of Periscope and the strategies it adopts to deal with this challenging application scenario.
Citations: 7
MUDVA: A multi-sensory dataset for the vehicular CPS applications
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813382
K. M. Alam, Mohammed Bin Hariz, Seyed Vahid Hosseinioun, M. Saini, Abdulmotaleb El Saddik
The Vehicular Cyber-Physical System (VCPS) is a new trend in research on intelligent transport systems (ITS). In a VCPS, vehicles work as hubs of sensors that collect interior and exterior information about the vehicle. Vehicles can use ad-hoc networking or 3G/LTE communication technology to share useful information with neighboring vehicles or with the infrastructure to accomplish user safety, comfort, and entertainment tasks. To facilitate efficient sensor-services fusion in VCPS applications, real-life vehicular sensory datasets are needed. While there have been many datasets containing vehicle mobility traces, there are hardly any that contain sensory information to be shared on the network. In this paper, we present a scenario-specific modular dataset architecture along with some multi-sensory dataset modules. One of the dataset modules provides time-synchronized multi-vehicle data including multi-view video, multi-directional sound, GPS, accelerometer, gyroscope, and magnetic field sensors. Each of the three vehicles recorded front, back, left, and right videos while moving closely together in suburban areas, to enable exploration of vehicular cooperative applications. Another module presents the tools and datasets needed to identify vehicular events such as acceleration, deceleration, turn, and no-turn events. We also present development details of a safety application that uses the presented datasets, along with a list of other possible applications.
Citations: 2
Adaptive frequency prior for frequency selective reconstruction of images from non-regular subsampling
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813347
Jürgen Seiler, André Kaup
Image signals are typically defined on a rectangular two-dimensional grid. However, there are scenarios where this does not hold and the image information is only available for a non-regular subset of pixel positions. For processing, transmitting, or displaying such an image signal, re-sampling to a regular grid is required. Recently, Frequency Selective Reconstruction (FSR) has been proposed as a very effective sparsity-based algorithm for solving this under-determined problem. To do so, FSR iteratively generates a model of the signal in the Fourier domain. In this context, a fixed frequency prior inspired by the optical transfer function is used to favor low-frequency content. However, this fixed prior is often too strict and may reduce reconstruction quality. To resolve this weakness, this paper proposes an adaptive frequency prior that takes the local density of the available samples into account. The proposed adaptive prior allows for a very high reconstruction quality, yielding gains of up to 0.6 dB PSNR over the fixed prior, independently of the density of the available samples. Compared to other state-of-the-art algorithms, visually noticeable gains of several dB are possible.
Citations: 2
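To put the PSNR gains quoted in the abstract by Seiler and Kaup into perspective, the sketch below computes PSNR from the mean squared error and converts a gain in dB into the corresponding MSE ratio. It assumes 8-bit samples (peak value 255) and flat lists of pixel values; the function names `psnr` and `mse_ratio_for_gain` and the sample data are hypothetical and not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


original = [100, 100, 100, 100]
estimate = [101, 99, 101, 99]  # every sample is off by one, so MSE = 1
print(round(psnr(original, estimate), 2))     # 48.13
print(round(mse_ratio_for_gain(0.6), 3))      # 0.871
```

Under these assumptions, a 0.6 dB gain corresponds to roughly 13% lower mean squared error, while a gain of several dB corresponds to the error being cut by half or more.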
Learning patch-based anchors for face hallucination
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813386
Wei-Jen Ko, Y. Wang, Shao-Yi Chien
With the goal of increasing the resolution of face images, recent face hallucination methods advance learning techniques that observe low- and high-resolution training patches to recover the output image of interest. Since most existing patch-based face hallucination approaches do not consider the location of the patches to be hallucinated, their performance might be limited. In this paper, we propose an anchored patch-based hallucination method that is able to exploit and identify image patches exhibiting structurally and spatially similar information. With these representative anchors observed, improved performance and computational efficiency can be achieved. Experimental results demonstrate that the proposed method achieves satisfactory performance and compares favorably against recent face hallucination approaches.
Citations: 0
Movie shot selection preserving narrative properties
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813397
Ioannis Mademlis, A. Tefas, N. Nikolaidis, I. Pitas
Automatic shot selection is an important aspect of movie summarization that is helpful both to producers and to audiences, e.g., for market promotion or browsing purposes. However, most of the related research has focused either on shot selection based on low-level video content, which disregards semantic information, or on narrative properties extracted from text, which requires the movie script to be available. This work investigates semantic shot selection based on the narrative prominence of movie characters in both the visual and the audio modalities, without the need for additional data such as a script. The output is a movie summary that contains only video frames from selected movie shots. Selection is controlled by a user-provided shot retention parameter that removes key-frames/key-segments from the skim based on actor face appearances and speech instances. This novel process (Multimodal Shot Pruning, or MSP) is algebraically modelled as a multimodal matrix Column Subset Selection Problem, which is solved using an evolutionary computing approach.
Citations: 7
Adaptive color space transforms for 4:4:4 video coding considering uncorrelated noise among color components
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813361
Kodai Kikuchi, T. Kajiyama, Kei Ogura, E. Miyashita
We propose a color space transform for 4:4:4 video coding. Uncorrelated noise contained in a specific color component deteriorates compression efficiency. The proposed transform can limit the propagation of the uncorrelated noise of that color component to other color components by introducing zero coefficients in the transform matrix. Simulation results show that the proposed transform improves image quality after compression using extended JPEG for certain natural images. To achieve high image quality for any kind of image, adaptive selection between conventional and the proposed color space transforms was performed by calculating the independence among color-difference components.
Citations: 0
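The effect of zero coefficients described in the abstract by Kikuchi et al. can be sketched with a small variance calculation. For independent per-channel noise, the noise variance of each transformed component is the sum of squared coefficients times the input variances, so a zero coefficient blocks propagation from that input. The example compares the standard YCgCo forward matrix with a hypothetical matrix `ZEROED` that was made up for this illustration; it is not the transform proposed in the paper, and the noise values are likewise assumptions.

```python
def output_noise_variance(transform, input_variances):
    """Per-output noise variance for independent input noise: sum_j T[i][j]^2 * var[j]."""
    return [
        sum(coeff * coeff * var for coeff, var in zip(row, input_variances))
        for row in transform
    ]


# Standard YCgCo forward transform applied to (R, G, B); rows are Y, Cg, Co.
YCGCO = [
    [0.25, 0.5, 0.25],
    [-0.25, 0.5, -0.25],
    [0.5, 0.0, -0.5],
]

# Hypothetical transform with zero coefficients that isolate B (illustration only).
ZEROED = [
    [0.5, 0.5, 0.0],
    [0.5, -0.5, 0.0],
    [0.0, 0.0, 1.0],
]

# Assume uncorrelated noise with variance 4 in the B component only.
noise = [0.0, 0.0, 4.0]

print(output_noise_variance(YCGCO, noise))   # [0.25, 0.25, 1.0]
print(output_noise_variance(ZEROED, noise))  # [0.0, 0.0, 4.0]
```

With the conventional matrix the noise from B leaks into all three outputs, whereas the zeroed matrix confines it to a single component, at the cost of less decorrelation for that component. This trade-off is consistent with the abstract's point that the choice between transforms should be made adaptively.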
Advanced residual coding for MPEG Surround encoder
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813371
I. Elfitri, M. Sobirin, Fadhlur Rahman, R. Kurnia
MPEG Surround (MPS) is widely known both as an efficient technique for encoding multi-channel audio signals and as a feature-rich audio standard. However, generating the residual signal on the basis of a single module in the MPS encoder can be considered suboptimal for compensating the error introduced by the down-mixing process. In this paper, an improved residual coding method is proposed to ensure that the down-mixing error is minimised, particularly for MPS operation at high bit-rates. The distortion introduced during the MPS encoding and decoding processes is studied first, which motivates the development of an approach that determines residual signals more accurately and thus compensates better for the distortion. A subjective test demonstrates that MPS with improved residual coding can be competitive with multi-channel Advanced Audio Coding (AAC) for encoding 5-channel audio signals at bit-rates of 256 and 320 kb/s.
Citations: 0
Magnetic resonance image classification using nonnegative matrix factorization and ensemble tree learning techniques
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813393
J. Ramírez, J. Górriz, Francisco J. Martínez-Murcia, F. Segovia, D. Salas-González
This paper presents a magnetic resonance image (MRI) classification technique based on nonnegative matrix factorization (NNMF) and ensemble tree learning methods. The system consists of a feature extraction process that applies NNMF to gray matter (GM) MRI first-order statistics of a number of sub-cortical structures, followed by a learning process for an ensemble of decision trees. The ensembles are trained by means of boosting and bagging, and their performance is compared in terms of the classification error and the receiver operating characteristic (ROC) curve using k-fold cross validation. The results show that NNMF is well suited for reducing the dimensionality of the input data without a penalty on the performance of the ensembles. The best performance in terms of convergence rate and minimum residual loss was obtained by bagging, especially for high-complexity classification tasks (i.e. NC vs. MCI and MCI vs. AD).
Citations: 2
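The k-fold cross validation mentioned in the abstract by Ramírez et al. can be sketched as a simple index split: the samples are divided into k contiguous folds, and each fold serves once as the test set while the rest are used for training. This is a generic sketch with no shuffling or stratification; the function name `k_fold_indices` is hypothetical, and the paper does not state which splitting scheme or value of k was used.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


for train, test in k_fold_indices(5, 2):
    print(train, test)
# [3, 4] [0, 1, 2]
# [0, 1, 2] [3, 4]
```

Averaging the classification error over the k test folds gives a performance estimate in which every sample is used for testing exactly once.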
Automatic camera self-calibration for immersive navigation of free viewpoint sports video
Pub Date : 2016-09-01 DOI: 10.1109/MMSP.2016.7813399
Qiang Yao, Hiroshi Sankoh, Keisuke Nonaka, S. Naito
In recent years, the demand for immersive experiences has triggered a great revolution in the applications and formats of multimedia. In particular, immersive navigation of free viewpoint sports video has become increasingly popular, and people would like to be able to actively select different viewpoints when watching sports videos to enhance the ultra-realistic experience. In the practical realization of immersive navigation of free viewpoint video, camera calibration is of vital importance. Automatic camera calibration is especially significant for real-time implementation, and the accuracy of the camera parameters directly determines the final experience of free viewpoint navigation. In this paper, we propose an automatic camera self-calibration method based on a field model for free viewpoint navigation in sports events. The proposed method is composed of three parts: extraction of field lines in a camera image, calculation of crossing points, and determination of the optimal camera parameters. Experimental results show that the proposed method can automatically estimate the camera parameters with high accuracy for a fixed camera, a dynamic camera, and multi-view cameras. Furthermore, immersive free viewpoint navigation in sports events can be fully realized based on the camera parameters estimated by the proposed method.
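The second step named in the abstract, calculation of crossing points of field lines, can be sketched with homogeneous coordinates: the line through two image points is their cross product, and the intersection of two lines is the cross product of the lines. This is a generic geometric sketch, not the authors' implementation; the function names, the tolerance `tol`, and the sample coordinates are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through image points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def crossing_point(line1, line2, tol=1e-9):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(line1, line2)
    if abs(w) < tol:
        return None
    return (x / w, y / w)


# Two hypothetical detected field lines, each given by two points on it.
side_line = line_through((0.0, 0.0), (10.0, 10.0))
goal_line = line_through((0.0, 10.0), (10.0, 0.0))
print(crossing_point(side_line, goal_line))  # (5.0, 5.0)
```

Crossing points found this way could then be matched against the known corner positions of a field model to estimate the camera parameters, which corresponds to the third step described in the abstract.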
Citations: 8
Journal
2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)