
Latest publications: 2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)

Injecting Sparsity in Anomaly Detection for Efficient Inference
Bokyeung Lee, Hanseok Ko
Anomaly detection in video is a challenging problem in computer vision. Deep networks have recently been applied to anomaly detection with competitive performance. Modern deep networks employ many modules that extract important features. Previous anomaly detection approaches improved performance simply by developing network architectures and inserting additional networks; however, these methods generally require a tremendous amount of computation and many training parameters. Because of real-world constraints such as field equipment and mobile systems, reducing the number of trainable parameters and the model capacity is an important issue in anomaly detection. Moreover, methods that improve the performance of an anomaly detection algorithm should be developed without additional trainable parameters. In this paper, we propose a sparsity-injecting module that reinforces the feature representation of an existing model, and we present an abnormality score function based on sparsity. In our experiments, the sparsity-injecting module improves the performance of state-of-the-art methods without additional trainable parameters.
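The abstract describes the module only at a high level, so the sketch below shows one plausible parameter-free realization: a top-k thresholding step that sparsifies a feature (or reconstruction-error) map, and a Hoyer-style sparsity measure used as the abnormality score. Both choices are our assumptions for illustration, not the paper's published formulation.

```python
import numpy as np

def hoyer_sparsity(x, eps=1e-8):
    # Hoyer measure in [0, 1]: 0 for a uniform vector, 1 for a one-hot vector.
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum()) + eps
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

def inject_sparsity(feature_map, keep_ratio=0.25):
    # Parameter-free reinforcement: keep the strongest activations, zero the rest.
    thresh = np.quantile(np.abs(feature_map), 1.0 - keep_ratio)
    return np.where(np.abs(feature_map) >= thresh, feature_map, 0.0)

# Toy usage: score a reconstruction-error map. Anomalous inputs tend to
# concentrate error in few locations, i.e. higher sparsity -> higher score.
error_map = np.abs(np.random.randn(32, 32))
abnormality = hoyer_sparsity(inject_sparsity(error_map, keep_ratio=0.25))
print(f"abnormality score: {abnormality:.3f}")
```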
Citations: 1
TrichTrack: Multi-Object Tracking of Small-Scale Trichogramma Wasps
Vishal Pani, M. Bernet, Vincent Calcagno, L. V. Oudenhove, F. Brémond
The behavior of Trichogramma wasps is studied extensively due to their effectiveness as biological control agents across the globe. However, to our knowledge, intra- and inter-species Trichogramma behavior is yet to be explored thoroughly. To study these behaviors, it is crucial to identify and track Trichogramma individuals over long periods in a lab setup. For this, we propose a robust tracking pipeline named TrichTrack. Since labeled data is unavailable, we train our detector with an iterative weakly supervised method. We also use a weakly supervised method to train a re-identification (ReID) network by leveraging noisy tracklet sampling, which enables us to distinguish Trichogramma individuals that are indistinguishable to the human eye. In addition, we develop a two-stage tracking module that filters out easy associations first to improve efficiency. Our method outperforms existing insect trackers on most MOT metrics, in particular on ID switches and fragmentations.
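As a rough illustration of the two-stage association idea, the sketch below resolves easy, high-IoU track-detection pairs greedily before running a joint assignment on the remainder. For brevity the second stage also uses an IoU cost, whereas TrichTrack's hard cases rely on the learned ReID embeddings.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    # Intersection-over-union of two [x1, y1, x2, y2] boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-8)

def two_stage_associate(tracks, dets, easy_gate=0.7, max_cost=0.9):
    # Stage 1: accept easy one-to-one high-IoU pairs cheaply.
    matches, open_t, open_d = [], list(range(len(tracks))), list(range(len(dets)))
    for ti in list(open_t):
        cands = [di for di in open_d if iou(tracks[ti], dets[di]) > easy_gate]
        if len(cands) == 1:
            matches.append((ti, cands[0]))
            open_t.remove(ti)
            open_d.remove(cands[0])
    # Stage 2: solve the remaining, harder pairs jointly (Hungarian algorithm).
    if open_t and open_d:
        cost = np.array([[1 - iou(tracks[t], dets[d]) for d in open_d] for t in open_t])
        for r, c in zip(*linear_sum_assignment(cost)):
            if cost[r, c] < max_cost:
                matches.append((open_t[r], open_d[c]))
    return matches
```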
Citations: 2
ARPD: Anchor-free Rotation-aware People Detection using Topview Fisheye Camera
Quan Minh Nguyen, Bang Le Van, Can Nguyen, Anh Le, Viet Dung Nguyen
People detection in top-view fisheye images is challenging, as people often appear in arbitrary orientations and are distorted differently across the image. Due to this unique radial geometry, axis-aligned people detectors often perform poorly on fisheye frames. Recent works account for this variability by modifying existing anchor-based detectors or by relying on complex pre- and post-processing. Anchor-based methods spread a set of pre-defined bounding boxes over the input image, most of which are invalid. Besides being inefficient, this approach can lead to a significant imbalance between positive and negative anchor boxes. In this work, we propose ARPD, a single-stage, anchor-free, fully convolutional network that detects arbitrarily rotated people in fisheye images. Our network uses keypoint estimation to find the center point of each object and regresses the object's other properties directly. To capture the varied orientations of people in fisheye views, ARPD predicts the angle of each bounding box in addition to its center and size. We also propose a periodic loss function that accounts for angle periodicity and relieves the difficulty of learning small-angle oscillations. Experimental results show that our method competes favorably with state-of-the-art algorithms while running significantly faster.
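A minimal sketch of a periodic angle loss of the kind the abstract describes: rotating a bounding box by one full period leaves it unchanged, so the loss is built from a cosine that vanishes at multiples of the period. The exact ARPD loss is not given in the abstract; this cosine form is a common construction, shown here as an assumption.

```python
import math
import torch

def periodic_angle_loss(pred, target, period=math.pi):
    # A box rotated by one full period is the same box, so the loss must
    # vanish there too; this cosine form is smooth and periodic.
    diff = (pred - target) * (2.0 * math.pi / period)
    return (1.0 - torch.cos(diff)).mean()

pred = torch.tensor([0.05, math.pi])   # second prediction is off by one period
target = torch.zeros(2)
print(periodic_angle_loss(pred, target))  # small: the period-off error costs 0
```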
Citations: 3
DAM: Dissimilarity Attention Module for Weakly-supervised Video Anomaly Detection
Snehashis Majhi, Srijan Das, F. Brémond
Video anomaly detection under weak supervision is complicated by the difficulty of separating anomalous from normal instances during training, which results in a non-optimal margin of separation. In this paper, we propose a framework built around a Dissimilarity Attention Module (DAM) that discriminates anomalous instances from normal ones at both the feature level and the score level. To decide whether instances are normal or anomalous, DAM takes local spatio-temporal dissimilarities (i.e., between clips within a video) into account rather than the global temporal context of the video. This allows the framework to detect anomalies in real-time (i.e., online) scenarios without extra window-buffer time. Furthermore, we adopt two variants of DAM for learning the dissimilarities between successive video clips. The proposed framework with DAM is validated on two large-scale anomaly detection datasets, UCF-Crime and ShanghaiTech, outperforming the online state-of-the-art approaches by 1.5% and 3.4%, respectively. The source code and models will be available at https://github.com/snehashismajhi/DAM-Anomaly-Detection
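The sketch below illustrates the core intuition under stated assumptions: dissimilarity between successive clip features (here, one minus cosine similarity) is turned into attention weights over per-clip anomaly scores. DAM itself is a learned module; this hand-crafted version only conveys the local, online nature of the computation.

```python
import numpy as np

def dissimilarity_attention(clip_feats):
    # clip_feats: (T, D) features of successive clips from one video.
    a, b = clip_feats[:-1], clip_feats[1:]
    cos = (a * b).sum(1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-8)
    dis = 1.0 - cos                         # high when consecutive clips differ
    att = np.concatenate([[dis[0]], dis])   # pad so every clip gets a weight
    return att / (att.max() + 1e-8)

scores = np.random.rand(8)                  # per-clip anomaly scores
feats = np.random.randn(8, 128)
print(scores * dissimilarity_attention(feats))  # dissimilar clips emphasized
```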
Citations: 9
Geometry-Based Person Re-Identification in Fisheye Stereo
Joshua Bone, Mertcan Cokbas, M. Tezcan, J. Konrad, P. Ishwar
Person re-identification with rectilinear cameras has been researched thoroughly to date. However, the topic has received little attention for fisheye cameras, and the few methods developed so far are appearance-based. We propose a geometry-based approach to re-identification for overhead fisheye cameras with overlapping fields of view. The main idea is that a person visible in two camera views is uniquely located in one camera's view given their height and their location in the other camera's view. We derive a height-dependent mathematical relationship between these locations using the unified spherical model for omnidirectional cameras. We also propose a new fisheye-camera calibration method and a novel automated approach to collecting calibration data. Finally, we propose four re-identification algorithms that leverage geometric constraints and demonstrate their excellent accuracy, which vastly exceeds that of a state-of-the-art appearance-based method, on a fisheye-camera dataset we collected.
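To make the height-dependent relationship concrete, the sketch below back-projects a head point to floor coordinates from one overhead camera and forward-projects it into the other, using an ideal equidistant fisheye model (r = f·θ) rather than the unified spherical model the paper employs; the camera parameters are made up for illustration.

```python
import numpy as np

def head_to_floor(u, v, cam, person_h):
    # Back-project an image point to floor (X, Y), assuming the person's height,
    # for an ideal overhead equidistant fisheye (r = f * theta).
    dx, dy = u - cam["cx"], v - cam["cy"]
    theta = np.hypot(dx, dy) / cam["f"]          # angle from the vertical axis
    rho = (cam["h"] - person_h) * np.tan(theta)  # horizontal range to the head
    phi = np.arctan2(dy, dx)
    return cam["x"] + rho * np.cos(phi), cam["y"] + rho * np.sin(phi)

def floor_to_image(X, Y, cam, person_h):
    # Forward projection: where this camera sees a head at (X, Y, person_h).
    rho = np.hypot(X - cam["x"], Y - cam["y"])
    theta = np.arctan2(rho, cam["h"] - person_h)
    r = cam["f"] * theta
    phi = np.arctan2(Y - cam["y"], X - cam["x"])
    return cam["cx"] + r * np.cos(phi), cam["cy"] + r * np.sin(phi)

# Re-identify by geometry: predict where camera B should see the person
# detected by camera A, then match to the nearest detection in B.
cam_a = {"cx": 640, "cy": 640, "f": 400, "h": 3.0, "x": 0.0, "y": 0.0}
cam_b = {"cx": 640, "cy": 640, "f": 400, "h": 3.0, "x": 4.0, "y": 0.0}
X, Y = head_to_floor(800, 700, cam_a, person_h=1.7)
print(floor_to_image(X, Y, cam_b, person_h=1.7))
```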
Citations: 3
Action Recognition with Fusion of Multiple Graph Convolutional Networks
Camille Maurice, F. Lerasle
We propose two light-weight, specialized Spatio-Temporal Graph Convolutional Networks (ST-GCNs): one for actions characterized by the motion of the human body, and a novel one designed specifically to recognize particular object configurations during the execution of human actions. We propose a late-fusion strategy over the predictions of both graph networks to get the most out of the two and to resolve ambiguities in action classification. This modular approach reduces memory cost and training time. Moreover, we extend the same late-fusion mechanism with a Bayesian approach to further improve performance. We show results on two public datasets, CAD-120 and Watch-n-Patch. Compared to the individual graphs, our late-fusion mechanism yields accuracy gains of +21 percentage points (pp) on Watch-n-Patch and +7 pp on CAD-120. Our approach outperforms most significant existing approaches.
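One plausible reading of the Bayesian late fusion is a naive-Bayes combiner over the two networks' per-class posteriors, sketched below; whether the paper uses exactly this form is our assumption.

```python
import numpy as np

def bayesian_late_fusion(p_body, p_object, prior=None, eps=1e-12):
    # Assuming conditional independence of the two networks given the class,
    # the fused posterior is proportional to the product of each network's
    # posterior divided by the class prior (textbook naive-Bayes combination).
    p_body, p_object = np.asarray(p_body, float), np.asarray(p_object, float)
    prior = np.full_like(p_body, 1.0 / p_body.size) if prior is None else prior
    fused = p_body * p_object / (prior + eps)
    return fused / fused.sum()

body = np.array([0.6, 0.3, 0.1])   # body-motion ST-GCN output
obj = np.array([0.2, 0.7, 0.1])    # object-configuration ST-GCN output
print(bayesian_late_fusion(body, obj))  # ambiguity between classes resolved
```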
Citations: 0
A Real-time Super-Resolution for Surveillance Thermal Cameras using optimized pipeline on Embedded Edge Device
Prayushi Mathur, A. Singh, Syed Azeemuddin, Jayram Adoni, Prasad Adireddy
Deep learning is scarcely explored in the domain of thermal imaging, although recovering high-resolution output from images and videos is a classical problem in many computer vision applications. In this paper, we propose an optimized pipeline for real-time video super-resolution with a thermal camera on an embedded edge device. To tackle the challenges involved, we contribute in the following ways: 1) a comparative study of selected deep-learning super-resolution models; 2) construction and optimization of an end-to-end inference pipeline; 3) integration of the whole workflow using cutting-edge technology; 4) real-time performance achieved using less data; and 5) experiments with the entire pipeline on our custom thermal dataset. As a result, the chosen model achieves real-time speeds above 29, 36, and 45 FPS, with PSNR/SSIM values of 32.9 dB/0.889, 31.86 dB/0.801, and 30.94 dB/0.728 for 2x, 3x, and 4x scaling factors, respectively.
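Since the evaluation is reported in PSNR/SSIM, a short sketch of the standard PSNR computation is included below; the synthetic degraded frame merely stands in for a super-resolved output.

```python
import numpy as np

def psnr(ref, out, peak=255.0):
    # Peak signal-to-noise ratio in dB between a reference and a restored frame.
    mse = np.mean((ref.astype(np.float64) - out.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Toy check: a noisy frame stands in for the output of the SR model.
ref = np.random.randint(0, 256, (120, 160)).astype(np.uint8)
out = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(f"{psnr(ref, out):.2f} dB")
```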
Citations: 4
An Efficient And Robust Framework For Collaborative Monocular Visual Slam
Dipanjan Das, Soumyadip Maity, B. Dhara
Visual SLAM (VSLAM) has shown remarkable performance in robot navigation, and its practical applicability can be enriched by building a multi-robot collaboration framework called visual collaborative SLAM (CoSLAM). CoSLAM extends SLAM to navigation over larger areas with multiple vehicles, for applications such as inspection, saving both time and power. A visual CoSLAM framework must address several problems: i) robots should be able to start from anywhere in the scene using their own VSLAM, which saves both time and power; ii) the framework should be independent of the choice of SLAM, for greater applicability across different SLAMs; and iii) collisions with other robots should be avoided by robustly merging two noisy maps once visual overlap is detected. Very few works in the literature address all of these problems in a single, practical framework. In this paper, we present a CoSLAM framework for monocular cameras that addresses all of the above. Unlike existing systems that work only with ORB-SLAM, our framework is truly SLAM-independent. We propose a deep-learning-based algorithm to find the visually overlapping scenes required for merging two or more 3D maps. Our map merging is robust in the presence of outliers because we compute similarity transforms from both structural information and camera-camera relationships and choose between them by statistical inference. Experimental results show that our framework is robust and works well with any individual SLAM; we demonstrate results on ORB-SLAM and EdgeSLAM, two prototypical extremes for map merging in a CoSLAM framework.
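For the structural-information branch of the map merging, one standard way to compute a similarity transform between matched 3D map points is Umeyama alignment, sketched below. The paper additionally uses camera-camera relationships, statistical inference between candidate transforms, and outlier-robust matching, none of which are shown here.

```python
import numpy as np

def umeyama_sim3(src, dst):
    # Least-squares similarity transform with dst_i ≈ s * R @ src_i + t
    # (Umeyama, 1991), from matched 3D map points.
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                      # guard against reflections
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

# Sanity check: recover a known transform from noisy correspondences.
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
dst = 2.0 * src @ R_true.T + np.array([1.0, -2.0, 0.5])
dst += rng.normal(scale=0.01, size=dst.shape)
s, R, t = umeyama_sim3(src, dst)
print(round(s, 3), np.round(t, 3))   # ~2.0 and ~[1, -2, 0.5]
```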
Citations: 0
The Dataset and Baseline Models to Detect Human Postural States Robustly against Irregular Postures
K. Bae, Kimin Yun, Jungchan Cho, Yuseok Bae
In many visual applications, we encounter people in irregular postures, such as lying down. Many approaches adopt a two-step method to handle such cases: 1) person detection, followed by 2) posture prediction for the detected person. However, detecting irregular postures is challenging because existing detectors were trained on datasets consisting mostly of upright postures. We therefore propose a new Irregular Human Posture (IHP) dataset covering the varied postures captured by real-world surveillance cameras. The IHP dataset provides annotations sufficient to understand a person's posture, including segmentation, keypoints, and postural states. This paper also provides two baseline networks for postural state estimation, trained on the IHP dataset. Moreover, we show that our baseline networks effectively detect people in irregular postures who may be in urgent situations in a surveillance environment.
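As a feel for what "postural state" means here, the sketch below gives a trivial rule-of-thumb baseline from body keypoints: the torso's tilt from vertical separates lying from upright postures. The keypoint names and threshold are our assumptions; the paper's baselines are learned networks.

```python
import numpy as np

def postural_state(kpts):
    # kpts: dict of (x, y) image coordinates for one detected person.
    shoulder = (np.array(kpts["l_shoulder"]) + np.array(kpts["r_shoulder"])) / 2
    hip = (np.array(kpts["l_hip"]) + np.array(kpts["r_hip"])) / 2
    torso = shoulder - hip
    # Angle between the torso axis and the image vertical, in degrees.
    angle = np.degrees(np.arctan2(abs(torso[0]), abs(torso[1]) + 1e-8))
    return "lying" if angle > 60 else "standing/sitting"

print(postural_state({"l_shoulder": (100, 50), "r_shoulder": (120, 52),
                      "l_hip": (102, 120), "r_hip": (118, 122)}))
```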
Citations: 1
Multi-Pedestrian Tracking with Clusters
Daniel Stadler, J. Beyerer
One of the biggest challenges in multi-pedestrian tracking arises in crowds, where missed detections can lead to wrong track-detection assignments, especially under heavy occlusion. To identify such situations, we cluster tracks and detections based on their overlaps and introduce different cluster states depending on the number of detections and tracks in a cluster. On the basis of this strategy, we make the following contributions. First, we propose a cluster-aware non-maximum suppression (CA-NMS) that leverages temporal information from tracks, applying an increased IoU threshold in clusters with severe occlusion to reduce the number of missed detections while limiting duplicate detections. Second, for clusters with very high overlap, where detections are missing even with CA-NMS, we use past track information to correct wrong assignments when missed targets are re-detected after occlusion. Furthermore, we propose a new tracking pipeline that combines the tracking-by-detection and regression-based tracking paradigms to improve association performance in crowded scenes. Putting it all together, our tracker achieves results competitive with the state of the art on three multi-pedestrian tracking benchmarks. We analyze the framework with extensive ablation experiments and evaluate the impact of each proposed tracking component on performance.
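The sketch below shows a simplified CA-NMS under stated assumptions: detections flagged as belonging to an occluded cluster tolerate a higher IoU before suppression. In the paper, that flag comes from the track-detection cluster states rather than being supplied externally.

```python
import numpy as np

def iou_matrix(boxes):
    # Pairwise IoU for [x1, y1, x2, y2] boxes of shape (N, 4).
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area[:, None] + area[None, :] - inter + 1e-8)

def cluster_aware_nms(boxes, scores, occluded, thr_base=0.5, thr_occl=0.7):
    # `occluded[i]` marks detections in a cluster flagged as heavily occluded;
    # those tolerate a higher overlap before suppression, so mutually
    # occluding people are not merged away.
    M = iou_matrix(boxes)
    order = np.argsort(scores)[::-1]
    keep, suppressed = [], np.zeros(len(boxes), dtype=bool)
    for i in order:
        if suppressed[i]:
            continue
        keep.append(i)
        for j in order:
            if j == i or suppressed[j]:
                continue
            thr = thr_occl if (occluded[i] or occluded[j]) else thr_base
            if M[i, j] >= thr:
                suppressed[j] = True
    return keep

boxes = np.array([[0., 0., 10., 20.], [3., 0., 13., 20.], [30., 0., 40., 20.]])
scores = np.array([0.9, 0.8, 0.7])
print(cluster_aware_nms(boxes, scores, np.array([True, True, False])))  # keeps both
print(cluster_aware_nms(boxes, scores, np.zeros(3, dtype=bool)))        # merges them
```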
Citations: 12