
2021 IEEE International Conference on Image Processing (ICIP): Latest Publications

Weakly-Supervised Multiple Object Tracking Via A Masked Center Point Warping Loss
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506732
Sungjoon Yoon, Kyujin Shim, Kayoung Park, Changick Kim
Multiple object tracking (MOT), a popular subject in computer vision with broad application areas, aims to detect and track multiple objects across an input video. However, recent learning-based MOT methods require strong supervision on both the bounding box and the ID of each object for every frame used during training, which raises the cost of obtaining labeled data. In this paper, we propose a weakly-supervised MOT framework that enables accurate tracking of multiple objects while being trained without object ID ground-truth labels. Our model is trained with bounding box information only, using a novel masked warping loss that drives the network to indirectly learn how to track objects through a video. Specifically, valid object center points in the current frame are warped with the predicted offset vector and enforced to be equal to the valid object center points in the previous frame. With this approach, we obtain MOT accuracy on the MOT17 dataset on par with state-of-the-art fully supervised MOT models, which use both bounding boxes and object IDs as ground-truth labels.
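A minimal sketch of how such a masked center-point warping loss could look (the tensor layout and the L1 penalty are illustrative assumptions, not the authors' implementation):

```python
import torch

def masked_center_warping_loss(centers_t, centers_prev, pred_offsets, valid_mask):
    """centers_t, centers_prev: (N, 2) object center coordinates in frames t and t-1.
    pred_offsets: (N, 2) network-predicted displacement vectors.
    valid_mask: (N,) boolean mask of centers visible in both frames."""
    warped = centers_t + pred_offsets            # warp current centers toward frame t-1
    err = torch.abs(warped - centers_prev).sum(dim=1)
    err = err[valid_mask]                        # the "masked" part: only valid centers count
    return err.mean() if err.numel() > 0 else err.sum()
```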
{"title":"Weakly-Supervised Multiple Object Tracking Via A Masked Center Point Warping Loss","authors":"Sungjoon Yoon, Kyujin Shim, Kayoung Park, Changick Kim","doi":"10.1109/ICIP42928.2021.9506732","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506732","url":null,"abstract":"Multiple object tracking (MOT), a popular subject in computer vision with broad application areas, aims to detect and track multiple objects across an input video. However, recent learning-based MOT methods require strong supervision on both the bounding box and the ID of each object for every frame used during training, which induces a heightened cost for obtaining labeled data. In this paper, we propose a weakly-supervised MOT framework that enables the accurate tracking of multiple objects while being trained without object ID ground truth labels. Our model is trained only with the bounding box information with a novel masked warping loss that drives the network to indirectly learn how to track objects through a video. Specifically, valid object center points in the current frame are warped with the predicted offset vector and enforced to be equal to the valid object center points in the previous frame. With this approach, we obtain an MOT accuracy on par with those of the state-of-the-art fully supervised MOT models, which use both the bounding boxes and object ID as ground truth labels, on the MOT17 dataset.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"84 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113992394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Modeling Image Quality Score Distribution Using Alpha Stable Model
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506196
Yixuan Gao, Xiongkuo Min, Wenhan Zhu, Xiao-Ping Zhang, Guangtao Zhai
In recent years, image quality has generally been described by a mean opinion score (MOS). However, we observe that the quality ratings an image receives from a group of subjects may not follow a Gaussian distribution, so the image quality cannot be fully described by a MOS. In this paper, we propose to describe image quality using a parameterized distribution rather than a MOS, and we also propose an objective method to predict the image quality score distribution (IQSD). Specifically, we selected 100 images from the LIVE database and invited a large group of subjects to evaluate their quality. By analyzing the subjective quality ratings, we find that the IQSD can be well modeled by an alpha stable model, which reflects much more information than a MOS. Therefore, we propose an algorithm to model the IQSD with an alpha stable model. Features are extracted from images based on natural scene statistics, and support vector regressors are trained to predict the IQSD described by the alpha stable model. We validate the proposed IQSD prediction model on the collected subjective quality ratings. Experimental results verify the effectiveness of the proposed algorithm in modeling the IQSD.
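As an illustration of the distribution-fitting step, SciPy's levy_stable can fit the four alpha-stable parameters (stability, skewness, location, scale) to a set of opinion scores; a minimal sketch with hypothetical ratings follows (note that the fit can be slow):

```python
import numpy as np
from scipy.stats import levy_stable

ratings = np.array([62, 55, 71, 48, 66, 59, 74, 52, 68, 61], dtype=float)  # hypothetical opinion scores for one image
alpha, beta, loc, scale = levy_stable.fit(ratings)
print(f"alpha={alpha:.2f}, beta={beta:.2f}, loc={loc:.1f}, scale={scale:.1f}")
print("MOS =", ratings.mean())  # the single number the fitted distribution replaces
```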
{"title":"Modeling Image Quality Score Distribution Using Alpha Stable Model","authors":"Yixuan Gao, Xiongkuo Min, Wenhan Zhu, Xiao-Ping Zhang, Guangtao Zhai","doi":"10.1109/ICIP42928.2021.9506196","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506196","url":null,"abstract":"In recent years, image quality is generally described by a mean opinion score (MOS). However, we observe that an image’s quality ratings given by a group of subjects may not follow a Gaussian distribution and the image quality can not be fully described by a MOS. In this paper, we propose to describe the image quality using a parameterized distribution rather than a MOS, and an objective method is also proposed to predict the image quality score distribution (IQSD). Specifically, we selected 100 images from the LIVE database and invited a large group of subjects to evaluate the quality of these images. By analyzing the subjective quality ratings, we find that the IQSD can be well modeled by an alpha stable model and this model can reflect much more information than MOS. Therefore, we propose an algorithm to model the IQSD described by an alpha stable model. Features are extracted from images based on natural scene statistics and support vector regressors are trained to predict the IQSD described by an alpha stable model. We validate the proposed IQSD prediction model on the collected subjective quality ratings. Experimental results verify the effectiveness of the proposed algorithm in modeling the IQSD.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"360 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124530462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Class Specific Interpretability in CNN Using Causal Analysis
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506118
Ankit Yadu, P. Suhas, N. Sinha
A singular problem that mars the wide applicability of machine learning (ML) models is the lack of generalizability and interpretability. The ML community is increasingly working on bridging this gap. Prominent among these efforts are methods that study the causal significance of features, with techniques such as the Average Causal Effect (ACE). In this paper, our objective is to utilize the causal analysis framework to measure the significance level of features in a binary classification task. Towards this, we propose a novel ACE-based metric called "Absolute area under ACE (A-ACE)", which computes the area of the absolute value of the ACE across the permissible levels of intervention. The performance of the proposed metric is illustrated on (i) the ILSVRC (ImageNet) dataset and (ii) the MNIST dataset (~42,000 images) by considering pair-wise binary classification problems. Encouraging results have been observed on both datasets: the computed metric values peak 10x higher than elsewhere on the ILSVRC dataset and 50% higher on the MNIST dataset at precisely those locations that human intuition would mark as distinguishing regions. The method captures a quantifiable metric that represents the distinction between the classes learnt by the model. This metric aids in the visual explanation of the model's prediction and thus makes the model more trustworthy.
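A minimal sketch of the A-ACE computation as described: integrate the absolute ACE over the permissible intervention levels (the toy ACE curve and intervention grid below are hypothetical; the paper's own causal-effect estimator may differ):

```python
import numpy as np

def a_ace(ace_fn, levels):
    """ace_fn: callable returning the ACE at a given intervention level.
    levels: 1-D array of permissible intervention values."""
    ace_vals = np.array([ace_fn(x) for x in levels])
    return np.trapz(np.abs(ace_vals), levels)  # area under |ACE|

levels = np.linspace(0.0, 1.0, 101)
print(a_ace(lambda x: np.sin(2 * np.pi * x), levels))  # toy curve, approx 2/pi
```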
{"title":"Class Specific Interpretability in CNN Using Causal Analysis","authors":"Ankit Yadu, P. Suhas, N. Sinha","doi":"10.1109/ICIP42928.2021.9506118","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506118","url":null,"abstract":"A singular problem that mars the wide applicability of machine learning (ML) models is the lack of generalizability and interpretability. The ML community is increasingly working on bridging this gap. Prominent among them are methods that study causal significance of features, with techniques such as Average Causal Effect (ACE). In this paper, our objective is to utilize the causal analysis framework to measure the significance level of the features in binary classification task. Towards this, we propose a novel ACE-based metric called “Absolute area under ACE (A-ACE)” which computes the area of the absolute value of the ACE across different permissible levels of intervention. The performance of the proposed metric is illustrated on (i) ILSVRC (Imagenet) dataset and (ii) MNIST data set $(sim 42000$ images) by considering pair-wise binary classification problem. Encouraging results have been observed on these two datasets. The computed metric values are found to be higher - peak performance of 10x higher than other for ILSVRC dataset and 50% higher than others for MNIST dataset - at precisely those locations that human intuition would mark as distinguishing regions. The method helps to capture the quantifiable metric which represents the distinction between the classes learnt by the model. This metric aids in visual explanation of the model’s prediction and thus, makes the model more trustworthy.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124536793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Depth-Assisted Joint Detection Network For Monocular 3d Object Detection
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506647
Jianjun Lei, Ting Guo, Bo Peng, Chuanbo Yu
In the past few years, monocular 3D object detection has attracted increasing attention due to its low cost and wide range of applications. In this paper, a depth-assisted joint detection network (MonoDAJD) is proposed for monocular 3D object detection. Specifically, a consistency-aware joint detection mechanism is proposed to jointly detect objects in the image and the depth map, exploiting the localization information from the depth detection stream to optimize the detection results. To obtain more accurate 3D bounding boxes, an orientation-embedded NMS is designed, which introduces an orientation confidence prediction and embeds this confidence into the traditional NMS. Experimental results on the widely used KITTI benchmark demonstrate that the proposed method achieves promising performance compared with state-of-the-art monocular 3D object detection methods.
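A minimal sketch of an orientation-embedded NMS in the spirit described above (weighting each box score by its orientation confidence before greedy suppression is an assumption about the exact embedding):

```python
import numpy as np

def orientation_nms(boxes, scores, orient_conf, iou_thr=0.5):
    """boxes: (N, 4) as [x1, y1, x2, y2]; scores, orient_conf: (N,)."""
    ranked = scores * orient_conf                # embed orientation confidence into ranking
    order = ranked.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU between box i and all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]             # suppress overlapping candidates
    return keep
```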
{"title":"Depth-Assisted Joint Detection Network For Monocular 3d Object Detection","authors":"Jianjun Lei, Ting Guo, Bo Peng, Chuanbo Yu","doi":"10.1109/ICIP42928.2021.9506647","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506647","url":null,"abstract":"In the past few years, monocular 3D object detection has attracted increasing attention due to the merit of low cost and wide range of applications. In this paper, a depth-assisted joint detection network (MonoDAJD) is proposed for monocular 3D object detection. Specifically, a consistency-aware joint detection mechanism is proposed to jointly detect objects in the image and depth map, and exploit the localization information from the depth detection stream to optimize the detection results. To obtain more accurate 3D bounding boxes, an orientation-embedded NMS is designed by introducing the orientation confidence prediction and embedding the orientation confidence into the traditional NMS. Experimental results on the widely used KITTI benchmark demonstrate that the proposed method achieves promising performance compared with the state-of-the-art monocular 3D object detection methods.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124222499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
FAST and Efficient Microlens-Based Motion Search for Plenoptic Video Coding
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506117
T. N. Huu, V. V. Duong, B. Jeon
Motion estimation, which plays an important role in video coding, accounts for much of the encoding computation. In this paper, starting from the ray motion characteristics of lenslet plenoptic video, we derive a new motion search model and propose a fast and efficient microlens-based motion search method. Theoretical analysis and experimental results verify the new model and demonstrate its search efficiency. Under the HEVC random-access configuration, we achieve not only a substantial encoding time reduction (56.7%) but also an average bitrate saving of 1.3% compared to relevant existing works. Under the low-delay configuration, the corresponding figures are 23.3% encoding time reduction and 2.3% bitrate saving.
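A minimal sketch of the core idea of a microlens-aligned search: candidate displacements are restricted to integer multiples of the lens pitch so that matched blocks stay on the lenslet grid (the pitch value, SAD cost, and full-grid scan are illustrative assumptions, not the paper's search model):

```python
import numpy as np

def microlens_motion_search(cur_blk, ref, top_left, pitch=15, radius=2):
    """Search only displacements that are integer multiples of the microlens pitch."""
    y0, x0 = top_left
    h, w = cur_blk.shape
    best_cost, best_mv = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y0 + dy * pitch, x0 + dx * pitch
            if yy < 0 or xx < 0 or yy + h > ref.shape[0] or xx + w > ref.shape[1]:
                continue  # candidate falls outside the reference frame
            cand = ref[yy:yy + h, xx:xx + w]
            cost = np.abs(cur_blk.astype(np.int64) - cand.astype(np.int64)).sum()  # SAD
            if cost < best_cost:
                best_cost, best_mv = cost, (dy * pitch, dx * pitch)
    return best_mv, best_cost
```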
{"title":"FAST and Efficient Microlens-Based Motion Search for Plenoptic Video Coding","authors":"T. N. Huu, V. V. Duong, B. Jeon","doi":"10.1109/ICIP42928.2021.9506117","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506117","url":null,"abstract":"The motion estimation which plays an important role in video coding requires much computation for encoding. In this paper, from the ray motion characteristics in the lenslet plenoptic video, we derive a new motion search model and propose a fast and efficient microlens-based motion search method. Theoretical analysis and experimental results have verified the new model and demonstrated its efficiency in search. Under the HEVC random-access configuration, we achieve not only substantial encoding time reduction (56.7%), but also bitrate saving of 1.3% on average compared to relevant existing works. Under the low delay configuration, the performances are 23.3% and 2.3%, respectively for encoding time reduction and bitrate saving.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127706131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Physiological Monitoring Of Front-Line Caregivers For Cv-19 Symptoms: Multi-Resolution Analysis & Convolutional-Recurrent Networks
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506495
O. Dehzangi, P. Jeihouni, V. Finomore, A. Rezai
Because COVID-19 is easily transmitted, a crucial step is the effective screening of front-line caregivers, one of the most vulnerable populations, for early signs and symptoms resembling the onset of the disease. Our aim in this paper is to track a combination of biomarkers in our ubiquitous experimental setup to monitor human participants and predict the likelihood of viral infection symptoms during the next 2 days, using a mobile app and an unobtrusive wearable ring to track their physiological indicators and self-reported symptoms. We propose a multi-resolution signal processing and modeling method to effectively characterize the changes in those physiological indicators. In this way, we decompose the 1-D windowed input time-series into a multi-resolution (i.e., 2-D spectro-temporal) space. We then fit our proposed deep learning architecture, which combines a recurrent neural network (RNN) and a convolutional neural network (CNN), to incorporate and model the sequence of multi-resolution snapshots in 3-D time-series space. The CNN is used to objectify the underlying features in each 2-D spectro-temporal snapshot, while the RNN is utilized to track the temporal dynamics of the snapshot sequences to predict the patients' COVID-19 related symptoms. As the experimental results show, our proposed architecture with the best configuration achieves 87.53% and 95.12% average accuracy in predicting the COVID-19 related symptoms.
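A minimal sketch of the CNN+RNN pattern described, where a small CNN encodes each 2-D spectro-temporal snapshot and a recurrent layer tracks the snapshot sequence (layer sizes, the GRU choice, and the tensor layout are assumptions):

```python
import torch
import torch.nn as nn

class ConvRecurrentNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-snapshot 2-D encoder
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.rnn = nn.GRU(32, 64, batch_first=True)     # tracks snapshot dynamics
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                               # x: (B, T, 1, F, W) snapshot sequence
        b, t = x.shape[:2]
        z = self.cnn(x.flatten(0, 1)).flatten(1)        # (B*T, 32) snapshot embeddings
        out, _ = self.rnn(z.view(b, t, -1))             # (B, T, 64)
        return self.head(out[:, -1])                    # predict from the last state
```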
{"title":"Physiological Monitoring Of Front-Line Caregivers For Cv-19 Symptoms: Multi-Resolution Analysis & Convolutional-Recurrent Networks","authors":"O. Dehzangi, P. Jeihouni, V. Finomore, A. Rezai","doi":"10.1109/ICIP42928.2021.9506495","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506495","url":null,"abstract":"Due to easy transmission of the COVID-19, a crucial step is the effective screening of the front-line caregivers are one of the most vulnerable populations for early signs and symptoms, resembling the onset of the disease. Our aim in this paper is to track a combination of biomarkers in our ubiquitous experimental setup to monitor the human participants’ operating system to predict the likelihood of the viral infection symptoms during the next 2 days using a mobile app, and an unobtrusive wearable ring to track their physiological indicators and self-reported symptoms. we propose a multi-resolution signal processing and modeling method to effectively characterize the changes in those physiological indicators. In this way, we decompose the 1-D input windowed time-series in multi-resolution (i.e. 2-D spectro-temporal) space. Then, we fitted our proposed deep learning architecture that combines recurrent neural network (RNN) and convolutional neural network (CNN) to incorporate and model the sequence of multi-resolution snapshots in 3-D time-series space. The CNN is used to objectify the underlying features in each of the 2D spectro-temporal snapshots, while the RNN is utilized to track the temporal dynamic behavior of the snapshot sequences to predict the patients’ COVID-19 related symptoms. As the experimental results show, our proposed architecture with the best configuration achieves 87.53% and 95.12% average accuracy in predicting the COVID-19 related symptoms.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126553712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Two-Stream Boosted TCRNet for Range-Tolerant Infra-Red Target Detection
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506170
Shah Hassan, Abhijit Mahalanobis
The detection of vehicular targets in infra-red imagery is a challenging task, due both to the relatively few pixels on target and to the false alarms produced by the surrounding terrain clutter. It has previously been shown [1] that a relatively simple network (known as TCRNet) can outperform conventional deep CNNs for such applications by maximizing a target-to-clutter ratio (TCR) metric. In this paper, we introduce a new form of the network (referred to as TCRNet-2) that further improves performance by first processing target and clutter information in two parallel channels and then combining them to optimize the TCR metric. We also show that overall performance can be improved considerably by boosting the primary TCRNet-2 detector with a secondary network that enhances discrimination between targets and clutter in the false alarm space of the primary network. We analyze the performance of the proposed networks using a publicly available dataset of infra-red images of targets in natural terrain. TCRNet-2 and its boosted version yield considerably better performance than the original TCRNet over a wide range of distances, in both day and night conditions.
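A minimal sketch of a target-to-clutter ratio (TCR) style objective in the spirit of the metric cited above: raise response energy on target chips relative to clutter chips (the exact form optimized by TCRNet-2 may differ):

```python
import torch

def tcr_loss(resp_target, resp_clutter, eps=1e-8):
    """resp_target / resp_clutter: network responses on target and clutter chips.
    Minimizing the negative log ratio boosts target energy over clutter energy."""
    num = (resp_target ** 2).mean()
    den = (resp_clutter ** 2).mean() + eps
    return -torch.log(num / den + eps)
```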
{"title":"Two-Stream Boosted TCRNet for Range-Tolerant Infra-Red Target Detection","authors":"Shah Hassan, Abhijit Mahalanobis","doi":"10.1109/ICIP42928.2021.9506170","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506170","url":null,"abstract":"The detection of vehicular targets in infra-red imagery is a challenging task, both due to the relatively few pixels on target and the false alarms produced by the surrounding terrain clutter. It has been previously shown [1] that a relatively simple network (known as TCRNet) can outperform conventional deep CNNs for such applications by maximizing a target to clutter ratio (TCR) metric. In this paper, we introduce a new form of the network (referred to as TCRNet-2) that further improves the performance by first processing target and clutter information in two parallel channels and then combining them to optimize the TCR metric. We also show that the overall performance can be considerably improved by boosting the performance of a primary TCRNet-2 detector, with a secondary network that enhances discrimination between targets and clutter in the false alarm space of the primary network. We analyze the performance of the proposed networks using a publicly available data set of infra-red images of targets in natural terrain. It is shown that the TCRNet-2 and its boosted version yield considerably better performance than the original TCRNet over a wide range of distances, in both day and night conditions.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128108658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Simtrojan: Stealthy Backdoor Attack
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506313
Yankun Ren, Longfei Li, Jun Zhou
Recent research indicates that deep learning models are vulnerable to adversarial attacks. A backdoor attack, also called a trojan attack, is a variant of adversarial attack in which a malicious attacker injects a backdoor into a model during the training phase. As a result, the backdoored model performs normally on clean samples but can be triggered by a backdoor pattern to assign backdoor samples a wrong target label specified by the attacker. However, the vanilla backdoor attack method causes a measurable difference between clean and backdoor samples in latent space, and several state-of-the-art defense methods exploit this to identify backdoor samples. In this paper, we propose a novel backdoor attack method called SimTrojan, which aims to inject backdoors into models stealthily. Specifically, SimTrojan makes clean and backdoor samples have indistinguishable representations in latent space to evade current defense methods. Experiments demonstrate that SimTrojan achieves a high attack success rate and is undetectable by state-of-the-art defense methods. The study suggests the urgency of building more effective defenses.
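A minimal conceptual sketch of the stealth objective described, mainly useful for seeing what latent-space defenses would have to detect: alongside the usual classification terms, a regularizer pulls backdoor-sample features toward clean-sample features (all names and the mean-feature alignment form are assumptions, not the paper's loss):

```python
import torch
import torch.nn.functional as F

def stealth_regularized_loss(logits_clean, y_clean, logits_bd, y_target,
                             feat_clean, feat_bd, lam=1.0):
    ce = F.cross_entropy(logits_clean, y_clean) + F.cross_entropy(logits_bd, y_target)
    latent_gap = F.mse_loss(feat_bd.mean(dim=0), feat_clean.mean(dim=0))  # align latent statistics
    return ce + lam * latent_gap
```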
{"title":"Simtrojan: Stealthy Backdoor Attack","authors":"Yankun Ren, Longfei Li, Jun Zhou","doi":"10.1109/ICIP42928.2021.9506313","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506313","url":null,"abstract":"Recent researches indicate deep learning models are vulnerable to adversarial attacks. Backdoor attack, also called trojan attack, is a variant of adversarial attacks. An malicious attacker can inject backdoor to models in training phase. As a result, the backdoor model performs normally on clean samples and can be triggered by a backdoor pattern to recognize backdoor samples as a wrong target label specified by the attacker. However, the vanilla backdoor attack method causes a measurable difference between clean and backdoor samples in latent space. Several state-of-the-art defense methods utilize this to identify backdoor samples. In this paper, we propose a novel backdoor attack method called SimTrojan, which aims to inject backdoor in models stealthily. Specifically, SimTrojan makes clean and backdoor samples have indistinguishable representations in latent space to evade current defense methods. Experiments demonstrate that SimTrojan achieves a high attack success rate and is undetectable by state-of-the-art defense methods. The study suggests the urgency of building more effective defense methods.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"2010 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125628202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Learning to Restore Images Degraded by Atmospheric Turbulence Using Uncertainty
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506614
R. Yasarla, Vishal M. Patel
Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the refractive index of the atmosphere. Variations in the refractive index cause the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation caused by atmospheric turbulence. In this paper, we propose a deep learning-based approach for restoring a single image degraded by atmospheric turbulence. We make use of epistemic uncertainty, estimated with Monte Carlo dropout, to identify regions of the image that the network finds hard to restore. The estimated uncertainty maps are then used to guide the network in producing the restored image. Extensive experiments on synthetic and real images show the significance of the proposed work.
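A minimal sketch of Monte Carlo dropout uncertainty estimation as described: run the restoration network several times with dropout active and use the per-pixel variance as the uncertainty map (in a real model one would enable only the dropout layers, since train() also switches batch-norm behavior):

```python
import torch

def mc_dropout_uncertainty(net, x, n_samples=8):
    was_training = net.training
    net.train()                                  # keep dropout stochastic at inference
    with torch.no_grad():
        preds = torch.stack([net(x) for _ in range(n_samples)])
    net.train(was_training)                      # restore the previous mode
    return preds.mean(dim=0), preds.var(dim=0)   # restored estimate, epistemic uncertainty map
```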
{"title":"Learning to Restore Images Degraded by Atmospheric Turbulence Using Uncertainty","authors":"R. Yasarla, Vishal M. Patel","doi":"10.1109/ICIP42928.2021.9506614","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506614","url":null,"abstract":"Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the index of refraction of the atmosphere. Variations in the refractive index causes the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation in images caused by atmospheric turbulence. In this paper, we propose a deep learning-based approach for restring a single image degraded by atmospheric turbulence. We make use of the epistemic uncertainty based on Monte Carlo dropouts to capture regions in the image where the network is having hard time restoring. The estimated uncertainty maps are then used to guide the network to obtain the restored image. Extensive experiments are conducted on synthetic and real images to show the significance of the proposed work.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"0104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125741304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Spcr: semi-supervised point cloud instance segmentation with perturbation consistency regularization
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506359
Yongbin Liao, Hongyuan Zhu, Tao Chen, Jiayuan Fan
Point cloud instance segmentation is steadily improving with the development of deep learning. However, current progress is hindered by the expensive cost of collecting dense point cloud labels. To this end, we propose the first semi-supervised point cloud instance segmentation architecture, called semi-supervised point cloud instance segmentation with perturbation consistency regularization (SPCR), which alleviates the data-hungry bottleneck of existing strongly supervised methods. Specifically, SPCR enforces invariance of the predictions over different perturbations applied to the input point clouds. We first introduce various perturbation schemes on the inputs to force the network to be robust and to generalize to unseen and unlabeled data. Perturbation consistency regularization is then applied to instance masks predicted from the various transformed inputs, providing self-supervision for network learning. Extensive experiments on the challenging ScanNet v2 dataset demonstrate that our method achieves competitive performance compared with state-of-the-art fully supervised methods.
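A minimal sketch of perturbation consistency regularization on unlabeled point clouds: predictions on a randomly perturbed copy are pushed toward predictions on the original (the jitter perturbation and KL consistency term are assumptions; the paper uses several perturbation schemes):

```python
import torch
import torch.nn.functional as F

def jitter(points, sigma=0.01):
    return points + sigma * torch.randn_like(points)  # one simple perturbation scheme

def consistency_loss(net, points_unlabeled):
    with torch.no_grad():
        target = net(points_unlabeled).softmax(dim=-1)        # pseudo-target on clean input
    pred = net(jitter(points_unlabeled)).log_softmax(dim=-1)  # prediction on perturbed input
    return F.kl_div(pred, target, reduction="batchmean")
```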
{"title":"Spcr: semi-supervised point cloud instance segmentation with perturbation consistency regularization","authors":"Yongbin Liao, Hongyuan Zhu, Tao Chen, Jiayuan Fan","doi":"10.1109/ICIP42928.2021.9506359","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506359","url":null,"abstract":"Point cloud instance segmentation is steadily improving with the development of deep learning. However, current progress is hindered by the expensive cost of collecting dense point cloud labels. To this end, we propose the first semi-supervised point cloud instance segmentation architecture, which is called semi-supervised point cloud instance segmentation with perturbation consistency regularization (SPCR). It is capable to alleviate the data-hungry bottleneck of existing strongly supervised methods. Specifically, SPCR enforces an invariance of the predictions over different perturbations applied to the input point clouds. We firstly introduce various perturbation schemes on inputs to force the network to be robust and easily generalized to the unseen and unlabeled data. Further, perturbation consistency regularization is then conducted on predicted instance masks from various transformed inputs to provide self-supervision for network learning. Extensive experiments on the challenging ScanNet v2 dataset demonstrate our method can achieve competitive performance compared with the state-of-the-art of fully supervised methods.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132480191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0