Performance Evaluation of Semantic Video Compression using Multi-cue Object Detection

Noor M. Al-Shakarji, F. Bunyak, H. Aliakbarpour, G. Seetharaman, K. Palaniappan
2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), published 2019-10-01. DOI: 10.1109/AIPR47015.2019.9174601 (https://doi.org/10.1109/AIPR47015.2019.9174601)
Citations: 2

Abstract

Video compression is critical in real-time aerial surveillance, where limited communication bandwidth and on-board storage severely restrict air-to-ground and air-to-air communications. In these settings, video data must be handled efficiently to ensure optimal storage, smooth transmission, and fast, reliable video analysis. Conventional video compression schemes were typically designed for human visual perception rather than automated video analytics, and the information loss and artifacts they introduce seriously limit the performance of automated analytics tasks. These limitations are more severe in aerial imagery because of complex backgrounds and small object sizes. In this paper, we describe and evaluate a salient region estimation pipeline for aerial imagery that enables adaptive bit-rate allocation during video compression. The salient regions are estimated using a multi-cue moving vehicle detection pipeline that fuses complementary appearance and motion cues, combining deep learning-based object detection with flux tensor-based spatio-temporal filtering. Adaptive compression results obtained with this multi-cue saliency estimation pipeline are compared against conventional MPEG and JPEG encoding in terms of compression ratio, image quality, and impact on automated video analytics operations. Experimental results on the ABQ urban aerial video dataset [1] show that incorporating contextual information enables semantic compression ratios of over 2000:1 while preserving image quality in the regions of interest. The proposed pipeline enables better utilization of the limited bandwidth of air-to-ground and air-to-air network links.
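
The idea of saliency-driven bit allocation can be illustrated with a minimal sketch. It is not the authors' pipeline: the saliency mask is assumed to be given (in the paper it would come from the multi-cue vehicle detector), `zlib` is used only as a stand-in for a real MPEG or JPEG encoder, and the function names and quantization steps (`roi_step`, `bg_step`) are hypothetical choices made for illustration. Salient pixels are kept at full precision while background pixels are quantized coarsely, so the background costs far fewer bits.

```python
import zlib


def quantize(value: int, step: int) -> int:
    """Snap an 8-bit intensity to the lower edge of its quantization bin."""
    return (value // step) * step


def adaptive_quantize(frame: bytes, mask: bytes,
                      roi_step: int = 1, bg_step: int = 64) -> bytes:
    """Quantize salient pixels finely and background pixels coarsely.

    frame: grayscale pixels, one byte each, in row-major order.
    mask:  same length as frame; non-zero marks a salient (ROI) pixel.
    roi_step / bg_step: hypothetical quantization steps, not from the paper.
    """
    if len(frame) != len(mask):
        raise ValueError("frame and mask must have the same number of pixels")
    return bytes(
        quantize(pixel, roi_step if salient else bg_step)
        for pixel, salient in zip(frame, mask)
    )


def compression_ratio(raw: bytes, encoded: bytes) -> float:
    """Ratio of uncompressed size to encoded size."""
    return len(raw) / len(encoded)


if __name__ == "__main__":
    width, height = 64, 64
    # Synthetic textured frame and an 8x8 salient box (illustrative data only).
    frame = bytes((x * 37 + y * 91 + (x * y) % 17) % 256
                  for y in range(height) for x in range(width))
    mask = bytes(1 if 20 <= x < 28 and 20 <= y < 28 else 0
                 for y in range(height) for x in range(width))

    adapted = adaptive_quantize(frame, mask)
    # zlib stands in for a video codec; real ratios depend on the encoder.
    print("uniform :", compression_ratio(frame, zlib.compress(frame, 9)))
    print("adaptive:", compression_ratio(frame, zlib.compress(adapted, 9)))
```

The ratios printed by this toy example say nothing about the 2000:1 figure reported above, which depends on the actual encoder, the dataset, and the fraction of each frame that is salient.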