2014 IEEE Visual Communications and Image Processing Conference: Latest Publications

Statistical reconstruction for predictive video coding
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051624
Catarina Brites, Vitor Gomes, J. Ascenso, F. Pereira
Substantial rate-distortion (RD) gains have been achieved in video coding standards by increasing the encoder complexity while keeping the decoder complexity as low as possible. The alternative distributed video coding (DVC) approach instead exploits the video redundancy mostly at the decoder side, keeping the encoder as simple as possible. One of the most characteristic DVC tools is the statistical reconstruction of the DCT coefficients, which plays a role similar to inverse scalar quantization (ISQ) in predictive codecs. The main objective of this paper is to propose a statistical reconstruction approach for predictive coding (notably the H.264/AVC standard) as a substitute for ISQ, thus creating a coding architecture that mixes predictive and distributed coding tools. Experimental results show that the proposed statistical reconstruction solution achieves Bjontegaard bitrate savings of up to 2.4% relative to the ISQ-based H.264/AVC High profile codec.
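The abstract does not give the reconstruction rule, but DVC-style statistical reconstruction is commonly posed as the conditional expectation of a coefficient given its quantization bin and a correlation-noise model centered on a predictor. A minimal numerical sketch under an assumed Laplacian model (the function names, the parameter `b`, and the numerical integration are illustrative, not the paper's exact method):

```python
import math

def laplacian_pdf(x, mu, b):
    """Laplacian density with location mu and scale b."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

def statistical_reconstruction(q_low, q_high, predictor, b=2.0, n=1000):
    """E[X | X in [q_low, q_high)] with X ~ Laplacian(predictor, b),
    approximated by midpoint numerical integration over the bin."""
    step = (q_high - q_low) / n
    num = den = 0.0
    for i in range(n):
        x = q_low + (i + 0.5) * step
        w = laplacian_pdf(x, predictor, b)
        num += x * w
        den += w
    return num / den if den > 0 else 0.5 * (q_low + q_high)
```

Unlike midpoint ISQ, the reconstructed value is pulled inside the bin toward the predictor, which is the intuition behind replacing ISQ with a statistical rule.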
Citations: 0
Low latency DASH based streaming over LTE
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051489
Y. Sanchez, E. Grinshpun, David W. Faucher, T. Schierl, Sameerkumar Sharma
Dynamic Adaptive Streaming over HTTP (DASH) is becoming the de facto technique for video delivery, especially for VoD services. Although 3GPP has specified carriage of DASH over eMBMS for live streaming, eMBMS is not available everywhere (operators are only starting service rollout), and the static SFN resource allocation for those services makes it worthwhile only for a reasonably large number of users. Thus, live streaming using DASH over unicast connections is still necessary; it may suffer from playback interruptions when the throughput varies, since the low end-to-end latency required for live streaming implies small buffers. To cope with throughput variations in mobile networks, we propose using scalable video coding, combining it with parallel TCP connections and prioritizing the most important data of the scalable video. We show that playback interruptions can be avoided by using LTE non-GBR bearers for prioritization.
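The core of the prioritization idea is that the scalable layers have a strict importance order: the base layer must arrive, enhancement layers only if capacity allows. A toy sketch of that layer selection (the function and the rate units are ours, not from the paper):

```python
def select_layers(layer_rates, throughput):
    """Greedily pick SVC layers, base layer first, that fit within an
    estimated throughput budget; layers must be taken in order because
    each enhancement layer depends on the ones below it."""
    total, chosen = 0.0, []
    for i, rate in enumerate(layer_rates):
        if total + rate <= throughput:
            total += rate
            chosen.append(i)
        else:
            break  # dependency order: cannot skip a layer
    return chosen
```

In the paper's setting, the layers selected this way would be mapped onto connections with different priorities, the base layer on the most protected one.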
Citations: 3
Online video object classification using fast similarity network fusion
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051577
Xianlong Lu, Chongyang Zhang, Xiaokang Yang
In this paper, we propose an online video object classification algorithm using fast Similarity Network Fusion (SNF). By constructing a sample-similarity network for each data type and then efficiently fusing these networks into a single similarity network that represents the full spectrum of the underlying data, SNF can identify subtypes among existing samples by clustering and predict labels for new samples based on the constructed network, which makes it distinctive for data integration and classification. The main problem with online classification using SNF is its complexity. The fast SNF (FSNF) proposed in this work consists of two main steps: dividing the matrix into two parts, and replacing the main part of the testing matrix with the same part of the training matrix. Since the main computation in SNF is obtaining the main part of the matrix, this replacement removes most of the computation load. Experiments on online surveillance video object classification show that, compared with SNF, the proposed FSNF achieves a 16x speed-up with only a 0.5%-0.6% loss in accuracy; FSNF also significantly outperforms existing traditional algorithms in classification accuracy.
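As we read the two-step description, the FSNF shortcut amounts to not recomputing the training-vs-training block of the similarity matrix at test time: that block is copied from the precomputed training matrix, so only the rows/columns involving new samples need fresh computation. A sketch of that block reuse (function and variable names are ours):

```python
def reuse_training_block(w_test, w_train):
    """Overwrite the top-left n-by-n (training-vs-training) block of the
    extended test similarity matrix with the precomputed training matrix,
    leaving the new-sample rows and columns untouched."""
    n = len(w_train)
    for i in range(n):
        for j in range(n):
            w_test[i][j] = w_train[i][j]
    return w_test
```

Since that block is the bulk of an (n+m)-by-(n+m) matrix when m new samples are few, reusing it avoids most of the per-frame cost, which is consistent with the reported speed-up.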
Citations: 5
Subjective evaluation and statistical analysis for improved frame-loss error concealment of 3D videos
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051534
M. Hasan, J. Arnold, M. Frater
Broadcasting of high-definition stereoscopic 3D videos is growing rapidly because of increased demand in the mass consumer market. In spite of increasing consumer interest, poor quality, crosstalk and other side effects, and the lack of defined broadcast standards have hampered the advancement of 3D displays. Real-time transmission of 3DTV sequences over packet-based networks may result in visual quality degradation due to packet loss and delay. In conventional 2D video, various extrapolation and directional interpolation strategies have been used for concealing missing blocks, but in 3D this is still an emerging field of research. Moreover, subjective testing is the most direct way to evaluate 3D quality and human perception of the concealed videos. This paper reviews state-of-the-art error concealment strategies and proposes a low-complexity frame-loss concealment method for a video decoder. Subjective testing on common 3D video sequences, and its statistical comparison with existing concealment methods, shows that the proposed method is efficient at concealing error frames of stereoscopic videos in terms of visual comfort and 3D quality.
Citations: 5
An area-efficient 4/8/16/32-point inverse DCT architecture for UHDTV HEVC decoder
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051538
Heming Sun, Dajiang Zhou, Jiayi Zhu, S. Kimura, S. Goto
This paper presents a new VLSI architecture for the HEVC inverse discrete cosine transform (IDCT). Compared to prior art, this work reduces hardware cost by: reducing the computational logic of the 1-D IDCTs with a reordered parallel-in serial-out (RPISO) scheme that shares the inputs of the butterfly structure; and reducing the area of the transpose buffer with a cyclic memory organization that achieves 100% I/O utilization of the SRAMs. In the implementation of a unified 4/8/16/32-point IDCT, the proposed schemes demonstrate 35% and 62% reductions in logic and memory costs, respectively. The IDCT implementation can support real-time decoding of 4K×2K 60fps video with a total hardware cost of 357,250 um² for the 2-D IDCT and 80,988 um² for the transpose memory in a 90nm process.
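For reference, the 4-point case of the HEVC core transform is a fixed integer matrix, and the 1-D inverse transform is a multiply by its transpose. A plain reference-form sketch (the standard's intermediate right-shifts and clipping between stages are omitted here, so this only illustrates the arithmetic that architectures like the RPISO scheme restructure):

```python
# HEVC 4-point core transform matrix (integer approximation of the DCT-II basis)
D4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def idct4(coeffs):
    """1-D 4-point inverse transform: multiply the coefficient vector by
    the transpose of D4 (no normalization shifts or clipping applied)."""
    return [sum(coeffs[k] * D4[k][j] for k in range(4)) for j in range(4)]
```

A 2-D inverse applies this per column, transposes, then applies it per row; the transpose buffer whose area this paper attacks sits between those two passes.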
Citations: 15
Stereo correspondence using an assisted discrete cosine transform method
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051509
Edward Rosales, L. Guan
In this paper, a stereo matching algorithm using a window-based frequency comparison method is formulated. The algorithm uses a local matching stereo model with a normalized cost function over frequency components and intensity values. It determines matching points in a stereo pair and uses a weighted cost function to determine the true disparity. Unlike classical stereo correspondence algorithms that determine initial disparity maps through window-based color intensity comparisons, the proposed algorithm uses window-based frequency comparisons to exploit the ability of frequency components to accurately locate highly detailed segments of the image. The algorithm is evaluated on the Middlebury data sets and shown to be resistant to noise and distortion, similar to the work in [1], thus allowing higher reliability during comparisons. This is an advantage over typical color intensity comparisons, as noise present in an image may cause mismatching when color intensity comparisons are used.
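A window-based frequency comparison can be illustrated by taking the 2-D DCT of the two candidate windows and comparing coefficients; a deliberately naive O(n^4) sketch (not the paper's normalized, weighted cost function, and without the intensity term it also uses):

```python
import math

def dct2d(block):
    """Naive 2-D DCT-II of a square block (normalization constants omitted,
    which is fine for comparing two blocks with the same transform)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[u][v] = s
    return out

def frequency_cost(block_a, block_b):
    """Sum of absolute differences between DCT coefficients of two windows;
    zero for identical windows, larger for spectrally dissimilar ones."""
    fa, fb = dct2d(block_a), dct2d(block_b)
    return sum(abs(a - b) for ra, rb in zip(fa, fb) for a, b in zip(ra, rb))
```

A real implementation would use a fast separable DCT and, as the abstract indicates, combine this frequency term with intensity values in a normalized cost.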
Citations: 0
Variable length dominant Gabor local binary pattern (VLD-GLBP) for face recognition
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051511
Jun Liu, Xiaojun Jing, Songlin Sun, Zifeng Lian
Gabor filters are among the most successful methods for face recognition. However, they dramatically increase the data volume of the face representation. To extract compact and distinctive information, we propose the Variable Length Dominant Gabor Local Binary Pattern (VLD-GLBP) for face recognition. It significantly reduces the face representation data volume while achieving performance comparable to complex state-of-the-art techniques. Specifically, local binary pattern (LBP) features are first computed from the Gabor images. Then, the most frequently occurring patterns are extracted to form the VLD-GLBP. Finally, the distance between VLD-GLBPs is computed to classify face images. Experimental results on the FERET database verify the efficiency of the proposed VLD-GLBP method.
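The two building blocks named in the abstract are standard: an 8-neighbour LBP code per pixel, and selection of the dominant (most frequent) patterns. A minimal sketch of both (how the paper sets the variable length k is not specified here, so it is left as a parameter):

```python
from collections import Counter

def lbp_code(img, x, y):
    """8-neighbour local binary pattern at pixel (x, y): each neighbour
    greater than or equal to the centre contributes one bit."""
    c = img[y][x]
    offsets = [(-1, -1), (0, -1), (1, -1), (1, 0),
               (1, 1), (0, 1), (-1, 1), (-1, 0)]
    code = 0
    for bit, (dx, dy) in enumerate(offsets):
        if img[y + dy][x + dx] >= c:
            code |= 1 << bit
    return code

def dominant_patterns(codes, k):
    """Keep the k most frequently occurring LBP codes (the 'dominant' set)."""
    return [c for c, _ in Counter(codes).most_common(k)]
```

In the proposed pipeline these codes would be computed on the Gabor-filtered images rather than raw pixels, and the dominant set forms the compact descriptor.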
Citations: 1
Packet scheduling in multicamera capture systems
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051575
L. Toni, Thomas Maugey, P. Frossard
In multiview video services, multiple cameras acquire the same scene from different perspectives, which results in correlated video streams. This generates large amounts of highly redundant data, which need to be properly handled during encoding and transmission of the multiview data. In this work, we study coding and transmission strategies in multicamera sets, where correlated sources need to be sent to a central server through a bottleneck channel and eventually delivered to interactive clients. We propose a dynamic correlation-aware packet scheduling optimization under delay, bandwidth, and interactivity constraints. A novel trellis-based solution permits a formal decomposition of the multivariate optimization problem, thereby significantly reducing the computational complexity. Simulation results show the gain of the proposed algorithm compared to baseline scheduling policies.
Citations: 2
Derived disparity vector based NBDV for 3D-AVC
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051517
Xin Zhao, Ying Chen, Li Zhang
In the 3D video extension of H.264/AVC, namely 3D-AVC, Neighboring Based Disparity Vector (NBDV) derivation has been proposed to support multiview/stereo compatibility, so that texture views can be decoded independently of depth views. NBDV generates a disparity vector for the current macroblock (MB) using the motion information of neighboring blocks, especially those coded with motion vectors pointing to inter-view reference pictures. In 3D-AVC, NBDV accesses a minimum number of spatial and temporal neighboring blocks, so there is a high probability that it does not derive an efficient disparity vector. This paper introduces a derived disparity vector scheme, in which a single disparity vector derived from NBDV is maintained for the whole slice and used as the disparity vector of the current MB whenever NBDV does not derive one from the neighboring blocks. Simulation results show that the proposed method provides a 3.6% bit rate reduction for multiview coding.
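The per-slice fallback described here is simple to state in code: remember the last disparity vector NBDV successfully derived, and use it when the neighbour search comes up empty. A sketch of that control flow (class and method names are ours; update rules in the actual proposal may differ):

```python
class SliceDisparityState:
    """Keep one derived disparity vector per slice; fall back to it
    when NBDV finds nothing in the neighbouring blocks."""

    def __init__(self):
        self.derived_dv = (0, 0)  # zero DV until something better is derived

    def disparity_for_mb(self, nbdv_result):
        """nbdv_result: (dx, dy) if NBDV succeeded for this MB, else None."""
        if nbdv_result is not None:
            self.derived_dv = nbdv_result  # remember for later MBs
        return self.derived_dv
```

The state is reset at each slice boundary, so the fallback never crosses slices and decoder-side derivation stays deterministic.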
Citations: 0
Intrinsic flexibility exploiting for scalable video streaming over multi-channel wireless networks
Pub Date : 2014-12-01 DOI: 10.1109/VCIP.2014.7051503
Ruixiao Yao, Yanwei Liu, Jinxia Liu, Pinghua Zhao, S. Ci
Scalable video has natural advantages in adapting to multi-channel wireless networks. Some existing works have tried to further optimize scalable video transmission by combining a crude layer-importance mapping with extrinsic techniques such as Forward Error Correction (FEC) and Adaptive Modulation and Coding (AMC). However, the intrinsic flexibility of scalable video streaming over multi-channel wireless networks has been neglected. In this paper, we exploit this intrinsic flexibility by first analyzing the priorities of H.264/SVC video data at the network abstraction layer unit (NALU) level, and then designing a priority-validity delivery scheme for scalable video streaming. With this strategy, the sub-stream extraction is intelligently adjusted according to the delivery history, and the more important data in a group of pictures (GOP) is delivered through the more reliable channels. Experimental results validate the strategy's effectiveness in improving the objective quality and perceptual experience of the received video.
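The "more important data on more reliable channels" rule can be sketched as a sorted pairing of NALU priority classes with channel reliability estimates; this is only our reading of the mapping step, not the paper's full priority-validity scheme (which also adapts to delivery history):

```python
def assign_layers_to_channels(layers, channels):
    """Map more-important data classes to more-reliable channels.

    layers:   list of (name, importance) pairs, higher importance = more critical
    channels: list of (name, loss_rate) pairs, lower loss rate = more reliable
    Returns a dict: layer name -> channel name.
    """
    by_importance = sorted(layers, key=lambda l: -l[1])
    by_reliability = sorted(channels, key=lambda c: c[1])
    return {l[0]: c[0] for l, c in zip(by_importance, by_reliability)}
```

For example, with a base layer, one enhancement layer, and channels of differing loss rates, the base layer lands on the lowest-loss channel.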
Citations: 2