
2011 IEEE International Symposium on Multimedia: Latest Publications

An Adaptive Inter Mode Decision for Multiview Video Coding
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.54
Wei Zhu, Peng Chen, Yayu Zheng, Jie Feng
Multiview video coding (MVC) plays an important role in 3D video systems, but its huge computational complexity hinders its application. This paper proposes an adaptive Inter mode decision algorithm to reduce the complexity of MVC. First, the selection of Inter modes is determined from the textural region type of the macroblock (MB). Then, the estimation of the small-size Inter modes (Inter16x8, Inter8x16, and Inter8x8) is decided based on the motion homogeneity of the MB, which is predicted using the motion estimation results of the Inter16x16 mode. Finally, the complexity of Inter8x8 mode estimation is progressively reduced by employing the rate-distortion (RD) costs of the already estimated modes. Compared to the full mode decision in the MVC reference software, the proposed algorithm achieves an average encoding time saving of 71%, with a 0.026 dB peak signal-to-noise ratio loss and a 0.74% bit rate increase.
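The abstract describes a three-stage decision cascade (texture type, motion homogeneity predicted from the Inter16x16 search, RD-cost based early termination). The Python sketch below illustrates that control flow only; the thresholds, parameter names and the early-termination rule are illustrative assumptions, not the paper's actual criteria.

```python
# Hedged sketch of a three-stage adaptive Inter mode decision.
# Thresholds and helper logic are illustrative assumptions, not the paper's.

import numpy as np

def candidate_inter_modes(mb_luma, mv_samples, t_texture=20.0, t_motion=1.5):
    """Return the list of Inter modes worth evaluating for one macroblock.
    mv_samples: a few (mvx, mvy) samples taken from the Inter16x16 motion
    estimation (assumed input used to predict motion homogeneity)."""
    modes = ["SKIP", "Inter16x16"]

    # Stage 1: textural region type -- flat MBs rarely benefit from small partitions.
    texture_activity = float(np.std(mb_luma))
    if texture_activity < t_texture:
        return modes

    # Stage 2: motion homogeneity predicted from the Inter16x16 results.
    mv = np.asarray(mv_samples, dtype=float)
    motion_spread = float(mv.std(axis=0).sum())
    if motion_spread < t_motion:
        return modes                      # homogeneous motion: large partitions suffice
    modes += ["Inter16x8", "Inter8x16", "Inter8x8"]
    return modes

def early_terminate_8x8(rd_costs, margin=1.05):
    """Stage 3: skip the costly Inter8x8 sub-partition refinement when the best
    large-partition RD cost is already competitive (illustrative rule only)."""
    best_large = min(cost for mode, cost in rd_costs.items() if mode != "Inter8x8")
    return rd_costs.get("Inter8x8", float("inf")) > margin * best_large

if __name__ == "__main__":
    mb = np.random.randint(0, 255, (16, 16))
    mvs = [(1, 0), (1, 1), (0, 0), (2, 1)]
    print(candidate_inter_modes(mb, mvs))
```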
Citations: 2
A Rateless UEP Convolutional Code for Robust SVC/MGS Wireless Broadcasting
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.51
Chung-hsuan Wang, J. Zao, Hsing-Min Chen, Pei-Lun Diao, Chih-Ming Chiu
Wireless broadcasting of scalable video coded medium grain scalable (SVC/MGS) bit streams requires unequal erasure protection (UEP) at the transport layer in order to ensure graceful degradation of playback video quality over a wide range of frame error rates. Modern wireless broadcasting systems even employ rateless fountain codes to aid the receivers in making inevitable trade-offs among picture quality, channel throughput and playback latency. Designing a rateless UEP channel code fit for such an application poses distinct engineering challenges, as the necessary protection for SVC base and enhancement layers differs by orders of magnitude while their intradependent groups of pictures fluctuate notably in size. In this paper, we present the design and implementation of a rateless UEP convolutional code that meets these demanding requirements. Use of this UEP channel code along with rate-distortion based network abstraction layer unit extraction offers sufficient protection to SVC bit streams under different lossy conditions without the need to re-code the bit stream. We also investigated the differences in playback performance of SVC bit streams protected by rateless codes versus conventional Reed-Solomon codes. The comparison makes clear the advantages and the disadvantages of employing rateless codes in protecting wireless video broadcasting.
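The code construction itself is not detailed in the abstract. As a loose illustration of what unequal erasure protection means in this setting, the sketch below allocates per-layer repair budgets that differ by roughly an order of magnitude between the base layer and the MGS enhancement layers; the function name, layer sizes and ratios are assumptions, not the paper's scheme.

```python
# Illustrative only: unequal protection budgets for SVC layers. The paper's
# rateless UEP convolutional code is not reproduced here; this just shows the
# kind of per-layer redundancy allocation such a scheme has to support.

def uep_repair_symbols(layer_sizes, protection_ratios):
    """layer_sizes: source symbols per layer (base layer first).
    protection_ratios: fraction of extra repair symbols per layer (assumed values)."""
    plan = {}
    for layer, (k, rho) in enumerate(zip(layer_sizes, protection_ratios)):
        plan[layer] = {"source": k, "repair": int(round(k * rho))}
    return plan

# The base layer gets far stronger protection than the MGS enhancement layers.
print(uep_repair_symbols(layer_sizes=[1200, 4000, 6000],
                         protection_ratios=[1.0, 0.15, 0.05]))
```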
Citations: 5
Camera Deployment for Video Panorama Generation in Wireless Visual Sensor Networks
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.105
Enes Yildiz, K. Akkaya, Esra Sisikoglu, M. Sir, Ismail Guneydas
In this paper, we tackle the problem of providing coverage for video panorama generation in Wireless Heterogeneous Visual Sensor Networks (VSNs), where cameras may differ in price, resolution, Field-of-View (FoV) and Depth-of-Field (DoF). We utilize multi-perspective coverage (MPC), which refers to the coverage of a point from given disparate perspectives simultaneously. For a given minimum average resolution, area boundaries, and variety of camera sensors, we propose a deployment algorithm which minimizes the total cost while guaranteeing full MPC of the area (i.e., the coverage needed for video panorama generation) and the minimum required resolution. Specifically, the approach is based on a bi-level mixed integer program (MIP), which runs two models, namely a master problem and a sub-problem, iteratively. The master problem provides coverage for an initial set of identified points while meeting the minimum resolution requirement at minimum cost. The sub-problem, which follows the master problem, finds an uncovered point and extends the set of points to be covered; it then sends this set back to the master problem. The master problem and sub-problem continue to run iteratively until the sub-problem becomes infeasible, which means full MPC has been achieved together with the resolution requirements. The numerical results show the superiority of our approach over existing approaches.
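The master/sub-problem interaction reduces to a simple iterative loop; the sketch below shows that loop under the assumption that `solve_master` and `find_uncovered_point` are hypothetical placeholders standing in for the paper's two MIP models, not an actual solver formulation.

```python
# Sketch of the iterative master/sub-problem loop described above.
# solve_master() and find_uncovered_point() are hypothetical stand-ins.

def deploy_cameras(initial_points, solve_master, find_uncovered_point, max_iters=100):
    """Iterate until the sub-problem can no longer find an uncovered point,
    i.e. full multi-perspective coverage (MPC) is achieved."""
    points = list(initial_points)
    deployment = None
    for _ in range(max_iters):
        # Master problem: minimum-cost deployment covering the current point set
        # while meeting the minimum resolution requirement.
        deployment = solve_master(points)
        # Sub-problem: search for a point the current deployment fails to cover.
        gap = find_uncovered_point(deployment)
        if gap is None:          # sub-problem infeasible -> full MPC reached
            break
        points.append(gap)       # extend the point set and re-solve the master problem
    return deployment, points

if __name__ == "__main__":
    toy_master = lambda pts: {"cameras": len(pts)}   # toy stand-in for the MIP
    toy_sub = lambda dep: None                       # pretend everything is covered
    print(deploy_cameras([(0, 0), (5, 5)], toy_master, toy_sub))
```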
Citations: 14
Developing a Real-Time System for Measuring the Consumption of Seasoning
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.71
Mayumi Ueda, Takuya Funatomi, Atsushi Hashimoto, Takahiro Watanabe, M. Minoh
In this paper, we propose a real-time system for measuring the consumption of various types of seasonings. In our system, all seasonings are placed on a scale, and we continuously take images of these items using a camera. Our system estimates the consumption of each condiment by calculating the difference between the weight when the seasoning was picked up and the weight when it was placed back on the scale. Our system identifies the type of seasoning that was used by determining whether or not the seasoning was present on the scale. By using our system, users can automatically log their usage of seasoning. Then, they can adjust the seasoning according to their desired taste.
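A minimal sketch of the weighing logic described above, assuming the camera-based identification has already determined which item left the scale; `missing_item` and the function name are hypothetical inputs, and the paper's image-processing step is not reproduced.

```python
# Consumption = drop in total scale weight across one pick-up/put-back cycle,
# attributed to the seasoning the camera saw disappear from the scale.

def consumption_from_event(total_before_pickup, total_after_putback, missing_item):
    """Return {seasoning: grams consumed} for one pick-up/put-back event."""
    used = total_before_pickup - total_after_putback
    return {missing_item: max(used, 0.0)}   # clamp sensor jitter to zero

# Example: the scale read 1250 g before the soy-sauce bottle was lifted and
# 1243 g after it was returned, so 7 g of soy sauce were used.
print(consumption_from_event(1250.0, 1243.0, "soy sauce"))
```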
Citations: 5
Delay-Aware Loss-Concealment Strategies for Real-Time Video Conferencing
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.14
Jingxi Xu, B. Wah
One-way audiovisual quality and mouth-to-ear delay (MED) are two important quality metrics in the design of real-time video-conferencing systems, and their trade-offs have significant impact on the user-perceived quality. In this paper, we address one aspect of this larger problem by developing efficient loss-concealment schemes that optimize the one-way quality under given MED and network conditions. Our experimental results show that our approach can attain significant improvements over the LARDo reference scheme that does not consider MED in its optimization.
Citations: 2
Comprehensive Analysis on the Effects of Noise Estimation Strategies on Image Noise Artifact Suppression Performance
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.24
Angus Leigh, A. Wong, David A Clausi, P. Fieguth
In this paper, the effects of employing different noise estimation strategies on the performance of noise artifact suppression techniques in achieving high image quality have been investigated. Most literature on the subject tends to use the true noise level of the noisy image when performing noise artifact suppression. However, this approach does not reflect how such techniques would be used in practical situations where the true noise level is unknown, which is common in most image and video processing applications. Therefore, in practical situations, the noise level must first be estimated before a noise artifact suppression technique can be applied using the estimated noise level. Through a comprehensive analysis of different noise estimation strategies, using empirical testing on a variety of images with different characteristics, the MAD wavelet noise estimation technique was found to be the overall preferred noise estimation technique for all of the popular noise artifact suppression techniques investigated (BM3D, bilateral filtering, NeighShrink, BLS-GSM and non-local means). Furthermore, the BM3D noise artifact suppression technique, combined with the MAD wavelet noise estimation technique, was found to offer the best performance in achieving high image quality in situations where the noise level is unknown and must be estimated. The outcome of this research is a set of clear recommendations that can be used in practice when suppressing noise artifacts exhibited in digital imagery and video.
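The MAD wavelet estimator favored by the study is the standard median-absolute-deviation rule applied to the finest diagonal wavelet band; a minimal sketch using PyWavelets follows, with the choice of the Haar (db1) wavelet being an assumption on my part.

```python
# MAD wavelet noise estimation: sigma is estimated from the median absolute
# deviation of the finest-scale diagonal (HH) wavelet coefficients.

import numpy as np
import pywt

def mad_wavelet_sigma(image):
    """Estimate the additive Gaussian noise standard deviation of a 2-D image."""
    # Single-level 2-D DWT; the HH (diagonal detail) band is dominated by noise.
    _, (_, _, hh) = pywt.dwt2(np.asarray(image, dtype=float), "db1")
    return np.median(np.abs(hh)) / 0.6745

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(0.0, 10.0, (256, 256))          # pure noise, sigma = 10
    print(f"estimated sigma: {mad_wavelet_sigma(noisy):.2f}")   # close to 10
```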
Citations: 14
A Subjective Evaluation of 3D IPTV Broadcasting Implementations Considering Coding and Transmission Degradation
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.89
Pierre R. Lebreton, A. Raake, M. Barkowsky, P. Callet
This paper describes the results of a subjective test to assess current technology used for 3DTV broadcasting. As a first aspect, the performance of the currently deployed coding schemes was compared to state-of-the-art algorithms. Our results show that downsampling and packing 3D stereoscopic videos according to the so-called Side-by-Side format gives the highest perceived quality for a given bit rate. The second aspect of the study was to investigate how common 2D error concealment algorithms perform in the 3D case, and how their 3D-related performance compares with the 2D case. The results provide information on whether binocular suppression or binocular rivalry plays the most important role for 3D video quality under transmission errors. The results indicate that binocular rivalry and the related visual discomfort are the dominant factors. Another aspect of the paper is a comparison of the test results with results from different labs to evaluate the repeatability of a subjective experiment in the 3D case, and to compare the employed test methodologies. Here, the study shows the variation between observers when they rate visual discomfort and illustrates the difficulty of evaluating this new dimension.
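As a side note on the Side-by-Side format mentioned above, the sketch below packs two views into one frame by naive horizontal decimation; a real broadcast chain would low-pass filter before downsampling, so this is purely illustrative and not the paper's processing chain.

```python
# Side-by-Side frame packing: each view is halved horizontally and the two
# half-width views are concatenated into a single full-resolution frame.

import numpy as np

def pack_side_by_side(left, right):
    """left, right: HxWx3 frames -> one HxWx3 frame holding both half-width views."""
    half_l = left[:, ::2, :]     # naive horizontal decimation (no anti-alias filter)
    half_r = right[:, ::2, :]
    return np.concatenate([half_l, half_r], axis=1)

if __name__ == "__main__":
    l = np.zeros((1080, 1920, 3), dtype=np.uint8)
    r = np.full((1080, 1920, 3), 255, dtype=np.uint8)
    print(pack_side_by_side(l, r).shape)   # (1080, 1920, 3)
```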
Citations: 8
Saliency Detection Using Region-Based Incremental Center-Surround Distance
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.47
Minwoo Park, Mrityunjay Kumar, A. Loui
A new method to detect salient regions in images is proposed in this paper. The proposed approach, which is inspired by object-based visual attention theory, segments the input image into coherent regions and measures the region-based center-surround distance (RBCSD), which is a distance between region attributes, such as color histograms, computed for each region and its surrounding region. Furthermore, segmented regions are merged such that the RBCSD of the merged region is greater than the individual RBCSDs of the component regions, through a region-based incremental center-surround distance (RBCSD+I) process. Due to this RBCSD+I process, merged regions may contain incoherent color regions, which improves the robustness of the proposed approach. The key advantages of the proposed algorithm are: (1) it provides a salient region with plausible object boundaries, (2) it is robust to color incoherency present in the salient region, and (3) it is computationally efficient. Extensive qualitative and quantitative evaluation of the proposed algorithm on widely used data sets, and comparison with existing saliency detection approaches, clearly indicate the feasibility and efficiency of the proposed approach.
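A hedged sketch of the center-surround histogram comparison behind RBCSD: the histogram binning, the chi-square distance and the dilation-based surround ring used below are illustrative choices, not necessarily the paper's exact settings.

```python
# Region-based center-surround distance: compare the colour histogram of a
# region with that of a ring of surrounding pixels.

import numpy as np
from scipy.ndimage import binary_dilation

def color_histogram(pixels, bins=8):
    """Joint RGB histogram (bins^3 cells), L1-normalised."""
    hist, _ = np.histogramdd(pixels.reshape(-1, 3), bins=(bins,) * 3,
                             range=[(0, 256)] * 3)
    hist = hist.ravel()
    return hist / max(hist.sum(), 1.0)

def rbcsd(image, region_mask, surround_width=15):
    """Center-surround distance between a region and the ring of pixels around it."""
    surround_mask = binary_dilation(region_mask, iterations=surround_width) & ~region_mask
    h_center = color_histogram(image[region_mask])
    h_surround = color_histogram(image[surround_mask])
    # Chi-square distance between the two normalised histograms.
    return 0.5 * np.sum((h_center - h_surround) ** 2 /
                        (h_center + h_surround + 1e-12))

if __name__ == "__main__":
    img = np.random.randint(0, 256, (64, 64, 3))
    mask = np.zeros((64, 64), dtype=bool)
    mask[20:40, 20:40] = True
    print(rbcsd(img, mask))
```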
Citations: 5
Scale-Optimized Textons for Image Categorization and Segmentation
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.48
Yousun Kang, A. Sugimoto
The texton is a representative dense visual word, and it has proven its effectiveness in categorizing materials as well as generic object classes. Despite its success and popularity, no prior work has tackled the problem of its scale optimization for a given image dataset and associated object category. We propose scale-optimized textons to learn the best scale for each object in a scene, and incorporate them into image categorization and segmentation. Our textonization process produces a scale-optimized codebook of visual words. We approach the scale-optimization problem of textons by using the scene-context scale in each image, which is the effective scale of the local context used to classify an image pixel in a scene. We perform the textonization process using a randomized decision forest, a powerful and computationally efficient tool in vision applications. Our experiments using the MSRC and VOC 2007 segmentation datasets show that our scale-optimized textons improve the performance of image categorization and segmentation.
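The textonization step itself, quantizing per-pixel filter responses into a codebook, can be sketched as follows. Note the paper performs this with a randomized decision forest and per-image scene-context scales; the stand-in below uses a small Gaussian filter bank with k-means purely for illustration, and all parameter values are assumptions.

```python
# Textonization sketch: per-pixel multi-scale filter responses quantised into
# a texton codebook (k-means stand-in for the paper's randomised decision forest).

import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.cluster.vq import kmeans2

def filter_responses(gray, sigmas=(1.0, 2.0, 4.0)):
    """Stack Gaussian-smoothed responses at several scales as per-pixel features."""
    return np.stack([gaussian_filter(gray, s) for s in sigmas], axis=-1)

def build_texton_codebook(gray_images, k=32, seed=0):
    feats = np.concatenate([filter_responses(g).reshape(-1, 3) for g in gray_images])
    centroids, _ = kmeans2(feats.astype(float), k, minit="++", seed=seed)
    return centroids

def textonize(gray, codebook):
    """Assign each pixel the index of its nearest texton."""
    f = filter_responses(gray).reshape(-1, 3)
    d = ((f[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(1).reshape(gray.shape)

if __name__ == "__main__":
    imgs = [np.random.rand(32, 32) for _ in range(2)]
    cb = build_texton_codebook(imgs, k=8)
    print(textonize(imgs[0], cb).shape)   # texton map with one label per pixel
```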
Citations: 1
Exploiting Text-Related Features for Content-based Image Retrieval
Pub Date : 2011-12-05 DOI: 10.1109/ISM.2011.21
Georg Schroth, S. Hilsenbeck, Robert Huitl, F. Schweiger, E. Steinbach
Distinctive visual cues are of central importance for image retrieval applications, in particular in the context of visual location recognition. While in indoor environments typically only a few distinctive features can be found, outdoors dynamic objects and clutter significantly impair retrieval performance. We present an approach which exploits text, a major source of information for humans during orientation and navigation, without the need for error-prone optical character recognition. To this end, characters are detected and described using robust feature descriptors like SURF. By quantizing them into several hundred visual words, we consider the distinctive appearance of the characters rather than reducing the set of possible features to an alphabet. Writings in images are transformed into strings of visual words termed visual phrases, which provide significantly improved distinctiveness compared to individual features. Approximate string matching is performed using N-grams, which can be efficiently combined with an inverted file structure to cope with large datasets. An experimental evaluation on three different datasets shows a significant improvement in retrieval performance while reducing the size of the database by two orders of magnitude compared to the state of the art. Its low computational complexity makes the approach particularly suitable for mobile image retrieval applications.
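The matching machinery described above (visual-word strings, N-grams, inverted file) maps naturally onto a small index structure. The sketch below uses plain set-based postings and a shared-N-gram count as the score, which is an assumption for illustration rather than the paper's exact ranking function.

```python
# N-gram matching of visual phrases with an inverted file: writings become
# strings of visual-word IDs, N-grams are extracted, and each N-gram maps to
# the images containing it. Shared N-grams tolerate a few mis-quantised words.

from collections import defaultdict

def ngrams(visual_word_ids, n=3):
    return [tuple(visual_word_ids[i:i + n]) for i in range(len(visual_word_ids) - n + 1)]

class NGramIndex:
    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)          # n-gram -> set of image ids

    def add(self, image_id, visual_phrase):
        for g in ngrams(visual_phrase, self.n):
            self.postings[g].add(image_id)

    def query(self, visual_phrase):
        scores = defaultdict(int)                 # approximate string matching:
        for g in ngrams(visual_phrase, self.n):   # count shared n-grams per image
            for image_id in self.postings[g]:
                scores[image_id] += 1
        return sorted(scores.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    index = NGramIndex(n=3)
    index.add("img_a", [12, 7, 33, 90, 41])       # visual-word IDs of a shop sign
    index.add("img_b", [5, 12, 7, 34, 90])
    print(index.query([12, 7, 33, 90]))           # img_a should rank first
```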
Citations: 28