Ke Li, Zhonghua Luo, Tong Zhang, Yinglan Ruan, Dan Zhou
Cognitive Robotics, Volume 2 (2022), Pages 177-185
DOI: 10.1016/j.cogr.2022.06.003
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2667241322000131/pdfft?md5=181cb8030eca6d4778b64500c49f1fa8&pid=1-s2.0-S2667241322000131-main.pdf
Spatiotemporal cue fusion-based saliency extraction and its application in video compression
Extracting salient regions plays an important role in computer vision tasks such as object detection, recognition, and video compression. Previous saliency detection studies were mostly conducted on individual frames and tended to extract saliency from spatial cues alone. The development of motion features further extends the saliency concept to motion saliency in videos. Compared with image-based saliency extraction, video-based saliency extraction is more challenging due to complicated distractors such as background dynamics and shadows. In this paper, we propose a novel saliency extraction method that fuses temporal and spatial cues. Specifically, long-term and short-term variations are comprehensively fused to extract the temporal cue, which is then used to establish background guidance for generating the spatial cue. The long-term variations and spatial cues jointly highlight the contrast between objects and the background, which mitigates the problem caused by shadows, while the short-term variations help remove background dynamics. Spatiotemporal cues are fully exploited to constrain saliency extraction across frames. The saliency extraction performance of our method is demonstrated by comparison with both unsupervised and supervised methods. Moreover, this saliency extraction model is applied to video compression, where it accelerates the compression task and achieves a higher PSNR for the region of interest (ROI).
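As a rough illustration of the idea of fusing long-term and short-term variations into a temporal cue (the paper's exact formulation is not reproduced here), the sketch below uses frame differencing for the short-term variation and a slowly updated running-average background model for the long-term variation. The update rate `alpha` and the equal weighting of the two terms are hypothetical choices, not values from the paper:

```python
import numpy as np

def temporal_cue(frames, alpha=0.05):
    """Fuse short-term and long-term variations into per-frame temporal cues.

    Sketch only: short-term variation = absolute frame difference,
    long-term variation = absolute deviation from a running-average
    background model that is updated slowly with rate `alpha`.
    Returns one cue map in [0, 1] per frame after the first.
    """
    bg = frames[0].astype(float)            # long-term background estimate
    prev = frames[0].astype(float)          # previous frame for differencing
    cues = []
    for f in frames[1:]:
        f = f.astype(float)
        short = np.abs(f - prev)            # short-term variation (motion)
        long_ = np.abs(f - bg)              # long-term variation vs background
        # Normalize both terms to [0, 1] and fuse with equal weights.
        cue = 0.5 * np.minimum(short / 255.0, 1.0) \
            + 0.5 * np.minimum(long_ / 255.0, 1.0)
        cues.append(cue)
        bg = (1 - alpha) * bg + alpha * f   # slow background update
        prev = f
    return cues
```

In such a scheme, a slowly moving shadow raises the short-term term only weakly and is gradually absorbed into the background model, while a genuinely moving object scores highly on both terms, which is consistent with the roles the abstract assigns to the two kinds of variation.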