Latest publications: 2010 IEEE International Workshop on Multimedia Signal Processing
Reference frame modification methods in scalable video coding (SVC)
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662019
A. Naghdinezhad, F. Labeau
With the rapid development of multimedia technology, video transmission over error-prone channels is widely used. Predictive video coding can lead to temporal and spatial propagation of channel errors, which results in severe degradation of the received video quality. To address this problem, different error-resilient methods have been proposed. In this paper, a number of error-resilient methods based on reference frame modification are briefly reviewed and examined with the scalable extension of H.264/AVC (SVC). We propose a new method based on the hierarchical structure used in temporal scalable coding. Average gains of 0.76 dB over the improved generalized source channel prediction (IGSCP) method and 2.26 dB over normal coding are achieved.
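The hierarchical prediction structure this method builds on assigns each frame of a GOP to a dyadic temporal layer; frames in lower layers are referenced by more frames, so errors there propagate furthest. A minimal sketch of that layer assignment (illustrative only; the paper's reference-frame modification itself is not reproduced here):

```python
def temporal_layer(frame_idx, gop_size):
    """Return the dyadic temporal layer of a frame in a hierarchical GOP.

    Layer 0 holds the key pictures (GOP boundaries); each further layer
    doubles the frame rate. For gop_size=8: frame 0 -> 0, frame 4 -> 1,
    frames 2,6 -> 2, frames 1,3,5,7 -> 3.
    """
    pos = frame_idx % gop_size
    if pos == 0:
        return 0
    layer = 1
    step = gop_size // 2
    while pos % step != 0:   # descend until pos sits on this layer's grid
        step //= 2
        layer += 1
    return layer
```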
Citations: 4
Controlling virtual world by the real world devices with an MPEG-V framework
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662028
Seungju Han, Jae-Joon Han, Youngkyoo Hwang, Jungbae Kim, Won-Chul Bang, J. D. Kim, Chang-Yeong Kim
Online networked virtual worlds such as SecondLife, World of Warcraft and Lineage have become increasingly popular. A life-scale virtual world presentation and intuitive interaction between users and virtual worlds would provide a more natural and immersive experience for users. Novel interaction technologies, such as sensing users' facial expressions and motion as well as the real-world environment, can provide a strong connection between the real and virtual worlds. For virtual worlds to be widely accepted and used, the various types of novel interaction devices require a unified interaction format between the real world and the virtual world, as well as interoperability among virtual worlds. Thus, MPEG-V Media Context and Control (ISO/IEC 23005) standardizes such connecting information. This paper provides an overview of MPEG-V and a usage example of its real-world-to-virtual-world (R2V) interfaces for controlling avatars and virtual objects in the virtual world with real-world devices. In particular, we investigate how the MPEG-V framework can be applied to the facial animation of an avatar in various types of virtual worlds.
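The R2V direction described above boils down to normalizing raw device-sensor readings into device-independent control values for the virtual world. A toy sketch of that normalization step (field names are hypothetical; ISO/IEC 23005 defines the actual sensed-information types as XML schemas, not Python classes):

```python
from dataclasses import dataclass


@dataclass
class SensedInfo:
    """Toy stand-in for one sensed-information entry (hypothetical fields)."""
    sensor_id: str
    value: float
    min_value: float
    max_value: float


def to_avatar_control(info: SensedInfo) -> float:
    """Normalize a raw sensor reading into [0, 1] for driving an avatar
    parameter, e.g. a facial-animation blend weight."""
    span = info.max_value - info.min_value
    return max(0.0, min(1.0, (info.value - info.min_value) / span))
```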
Citations: 6
Fusion of active and passive sensors for fast 3D capture
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5661996
Qingxiong Yang, K. Tan, Bruce Culbertson, J. Apostolopoulos
We envision a conference room of the future where depth sensing systems are able to capture the 3D position and pose of users, and enable users to interact with digital media and content shown on immersive displays. The key technical barrier is that current depth sensing systems are noisy, inaccurate, and unreliable. It is well understood that passive stereo fails in non-textured, featureless portions of a scene. Active sensors, on the other hand, are more accurate in these regions and tend to be noisy in highly textured regions. We propose a way to synergistically combine the two to create a state-of-the-art depth sensing system which runs in near real time. In contrast, the only previously known fusion method is slow and fails to take advantage of the complementary nature of the two types of sensors.
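One simple way to exploit the complementarity described above is a per-pixel confidence blend: trust passive stereo where the image is textured and the active sensor elsewhere, with texture approximated by local gradient magnitude. A toy weighting assuming aligned depth maps, not the fusion rule of the paper:

```python
import numpy as np


def fuse_depth(active, passive, image, alpha=1.0):
    """Blend an active-sensor depth map with a passive-stereo depth map.

    Weight passive stereo more in textured regions (where it is reliable)
    and the active sensor more in textureless ones. `alpha` controls how
    much texture is needed before stereo dominates.
    """
    gy, gx = np.gradient(image.astype(float))
    texture = np.sqrt(gx**2 + gy**2)
    w = texture / (texture + alpha)          # in [0, 1): high where textured
    return w * passive + (1.0 - w) * active
```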
Citations: 80
Visual quality of current coding technologies at high definition IPTV bitrates
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662052
Christian Keimel, Julian Habigt, Tim Habigt, Martin Rothbucher, K. Diepold
High definition video over IP-based networks (IPTV) has become a mainstay in today's consumer environment. In most applications, encoders conforming to the H.264/AVC standard are used. But even within one standard, a wide range of coding tools is often available, and these can deliver vastly different visual quality. In this contribution we therefore evaluate different coding technologies, using different encoder settings of H.264/AVC, but also a completely different encoder, Dirac. We cover a wide range of bitrates from ADSL to VDSL and different content, placing both low and high demands on the encoders. As PSNR is not well suited to describe perceived visual quality, we conducted extensive subjective tests to determine the visual quality. Our results show that for currently common bitrates, the visual quality can be more than doubled if the same coding technology but different coding tools are used.
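For reference, PSNR, the objective metric the authors argue does not reflect perceived quality, is computed from the mean squared error against the reference frame:

```python
import numpy as np


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between reference and test frames."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # identical frames
    return 10.0 * np.log10(peak**2 / mse)
```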
Citations: 25
Data hiding of motion information in chroma and luma samples for video compression
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662022
Jean-Marc Thiesse, Joël Jung, M. Antonini
2010 appears to be the launching date for new compression activities intended to challenge the current video compression standard, H.264/AVC. Several improvements to this standard are already known, such as competition-based motion vector prediction. However, the targeted 50% bitrate saving at equivalent quality has not yet been achieved. In this context, this paper proposes to reduce the signaling information resulting from this vector competition by using data hiding techniques. As data hiding and video compression traditionally have contradictory goals, a study of data hiding is first performed. Then, an efficient way of using data hiding for video compression is proposed. The main idea is to hide the indices in appropriately selected chroma and luma transform coefficients. To minimize the prediction errors, the modification is performed via a rate-distortion optimization. Objective improvements (up to 2.3% bitrate saving) and a subjective assessment of chroma loss are reported and analyzed for several sequences.
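The core idea, carrying a motion-vector-competition index inside transform coefficients, can be illustrated with simple parity embedding; in the paper the coefficient to modify is instead chosen by rate-distortion optimization:

```python
def hide_bit(coeff, bit):
    """Force the parity of a quantized transform coefficient to carry `bit`.

    If the parity already matches, the coefficient is untouched; otherwise
    it is nudged by one step toward zero when possible, to limit distortion.
    Toy scheme, not the paper's RD-optimized selection.
    """
    if abs(coeff) % 2 == bit:
        return coeff
    return coeff - 1 if coeff > 0 else coeff + 1


def read_bit(coeff):
    """Decoder side: recover the hidden bit from the coefficient parity."""
    return abs(coeff) % 2
```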
Citations: 10
Private content identification: Performance-privacy-complexity trade-off
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5661994
S. Voloshynovskiy, O. Koval, F. Beekhof, F. Farhadzadeh, T. Holotyak
In light of the recent development of multimedia and networking technologies, an exponentially increasing amount of content is available via various public services. This is why content identification has attracted a lot of attention. One possible technology for content identification is based on digital fingerprinting. When trying to establish information-theoretic limits in this application, it is usually assumed that the codewords are of infinite length and that a jointly typical decoder is used in the analysis. These assumptions represent a certain over-generalization for the majority of practical applications. Consequently, the impact of finite length on the mentioned limits remains an open and largely unexplored problem. Furthermore, leaking of privacy-related information to third parties due to storage, distribution and sharing of fingerprinting data represents an emerging research issue that should be addressed carefully. This paper presents an information-theoretic analysis of finite-length digital fingerprinting under privacy constraints. A particular link between the considered setup and Forney's erasure/list decoding [1] is presented. Finally, complexity issues of reliable identification in large databases are addressed.
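The link to Forney's erasure/list decoding can be pictured with a toy identification routine over binary fingerprints: return every database entry within a Hamming radius of the query (a list), or nothing (an erasure). A sketch with made-up data, not the paper's decoder:

```python
def identify(query, database, max_dist):
    """List-decode a query fingerprint against a fingerprint database.

    Fingerprints are equal-length bit strings. Returns the indices of all
    entries within Hamming distance `max_dist`; an empty list is an erasure.
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    return [i for i, fp in enumerate(database) if hamming(query, fp) <= max_dist]
```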
Citations: 0
A hierarchical statistical model for object classification
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662071
A. Bakhtiari, N. Bouguila
In many applications it is necessary to classify images in a database accurately and with acceptable speed. The main problem is to assign different images to the right categories. The latter problem becomes more challenging when dealing with large databases with many categories and subcategories. In this paper we propose a novel classification method based on a hierarchical Dirichlet generative model originally proposed for document corpus classification. To adapt the model to image data, we use the bag-of-visual-words representation. We show that, properly applied, the model achieves adequate results for hierarchical image classification. Experimental results are presented and discussed to show the merits of the proposed approach.
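The bag-of-visual-words step maps each image to a histogram over a learned codebook of local descriptors. A minimal NumPy sketch, assuming the codebook (e.g. k-means centers over training descriptors) is already available:

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a visual codebook and return the
    normalized word-frequency histogram (the bag-of-visual-words vector).

    descriptors: (n, d) array of local features (e.g. SIFT);
    codebook:    (k, d) array of cluster centers learned off-line.
    """
    # squared Euclidean distance from every descriptor to every visual word
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                      # nearest visual word
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```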
Citations: 10
Color transfer for complex content images based on intrinsic component
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662011
Wan-Chien Chiou, Yi-Lei Chen, Chiou-Ting Hsu
This paper proposes an automatic color transfer method for processing images with complex content based on intrinsic components. Although several automatic color transfer methods have been proposed that include region information and/or use multiple references, these methods tend to become ineffective when processing images with complex content and lighting variation. In this paper, our goal is to incorporate the idea of intrinsic components to better characterize the local organization within an image and to reduce color-bleeding artifacts across complex regions. Using intrinsic information, we first represent each image at the region level and determine the best-matched reference region for each target region. Next, we conduct color transfer between the best-matched region pairs and perform weighted color transfer for pixels across complex regions in a de-correlated color space. Both subjective and objective evaluations of our experiments demonstrate that the proposed method outperforms existing methods.
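The statistics matching underlying this family of methods is the classic Reinhard-style mean/variance alignment. A global sketch on raw channels (the paper instead applies it per matched region pair, in a de-correlated color space, with intrinsic-component weighting):

```python
import numpy as np


def transfer_stats(target, reference):
    """Match per-channel mean and standard deviation of `target` to
    `reference` (Reinhard-style global color transfer)."""
    t = target.astype(float)
    r = reference.astype(float)
    mu_t, sd_t = t.mean((0, 1)), t.std((0, 1)) + 1e-8   # avoid divide-by-zero
    mu_r, sd_r = r.mean((0, 1)), r.std((0, 1))
    return (t - mu_t) / sd_t * sd_r + mu_r
```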
Citations: 7
Real-time particle filtering with heuristics for 3D motion capture by monocular vision
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662008
David Antonio Gómez Jáuregui, P. Horain, Manoj Kumar Rajagopal, S. S. Karri
Particle filtering is known as a robust approach for motion tracking by vision, at the cost of heavy computation in a high-dimensional pose space. In this work, we describe a number of heuristics that we demonstrate to jointly improve robustness and real-time performance for motion capture. 3D human motion capture by monocular vision without markers can be achieved in real time by registering a 3D articulated model on a video. First, we search the high-dimensional space of 3D poses by generating new hypotheses (or particles) with equivalent 2D projections by kinematic flipping. Second, we use a semi-deterministic particle prediction based on local optimization. Third, we deterministically resample the probability distribution for a more efficient selection of particles. Particles (or poses) are evaluated using a match cost function and penalized with a Gaussian probability pose distribution learned off-line. To achieve real-time performance, the measurement step is parallelized on the GPU using the OpenCL API. We present experimental results demonstrating robust real-time 3D motion capture with a consumer computer and webcam.
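Deterministic resampling is commonly realized as systematic resampling, which spends a single random draw for the whole particle set: one offset positions n evenly spaced pointers on the weight CDF. A minimal sketch (not necessarily the exact variant used in the paper):

```python
import numpy as np


def systematic_resample(weights, rng=None):
    """Resample particle indices in proportion to their weights using a
    single stratified offset (systematic resampling)."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    n = len(w)
    u0 = (rng or np.random.default_rng()).uniform(0, 1.0 / n)
    positions = u0 + np.arange(n) / n          # n evenly spaced pointers
    return np.searchsorted(np.cumsum(w), positions)
```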
Citations: 10
An improved foresighted resource reciprocation strategy for multimedia streaming applications
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662055
Ester Gutiérrez, Hyunggon Park, P. Frossard
In this paper, we present a solution for efficient multimedia streaming over P2P networks based on the foresighted resource reciprocation strategy. We study several priority functions that explicitly consider the timing constraints and the importance of each data segment in terms of multimedia quality, and successfully incorporate them into the foresighted resource reciprocation strategy. This enables peers to enhance their multimedia streaming capability. The simulation results confirm that the proposed approach outperforms existing algorithms such as tit-for-tat in BitTorrent and BiToS.
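A concrete example of such a priority function combines segment importance (e.g. base layer vs. enhancement layer) with playback-deadline urgency. This is a hypothetical weighting for illustration, not one of the functions evaluated in the paper:

```python
def piece_priority(deadline, importance, now, window=10.0):
    """Toy urgency score for a media segment.

    Segments past their deadline are worthless; otherwise the importance is
    scaled up as the deadline approaches within the given time window.
    """
    time_left = deadline - now
    if time_left <= 0:
        return 0.0
    urgency = max(0.0, (window - time_left) / window)  # grows near deadline
    return importance * (1.0 + urgency)
```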
Citations: 4