
Latest publications from the 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)

Reducing Viral Transmission through AI-based Crowd Monitoring and Social Distancing Analysis
Benjamin Fraser, Brendan Copp, Gurpreet Singh, Orhan Keyvan, Tongfei Bian, Valentin Sonntag, Yang Xing, Weisi Guo, A. Tsourdos
This paper explores multi-person pose estimation for reducing the risk of airborne pathogens. The recent COVID-19 pandemic highlights these risks in a globally connected world. We developed several techniques which analyse CCTV inputs for crowd analysis. The framework utilises automated homography from pose feature positions to determine interpersonal distance. It also incorporates mask detection by using pose features for an image classification pipeline. A further model predicts the behaviour of each person by using their estimated pose features. We combine the models to assess transmission risk based on recent scientific literature. A custom dashboard displays a risk density heat-map in real time. This system could improve public space management and reduce transmission in future pandemics. This context-agnostic system has many applications for other crowd monitoring problems.
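As a rough illustration of the homography-based distancing step described in this abstract (a minimal sketch, not the authors' implementation; the calibration correspondences and the 2 m threshold below are assumptions), detected foot keypoints can be projected onto the ground plane with OpenCV and pairwise separations measured:

import numpy as np
import cv2

# Four image points (pixels) with known ground-plane coordinates (metres);
# such correspondences are assumed to come from a one-off scene calibration.
img_pts = np.array([[100, 700], [1180, 690], [900, 400], [300, 410]], dtype=np.float32)
world_pts = np.array([[0, 0], [10, 0], [10, 15], [0, 15]], dtype=np.float32)
H, _ = cv2.findHomography(img_pts, world_pts)

def interpersonal_distances(foot_pixels):
    """Project per-person foot keypoints to the ground plane, return pairwise distances in metres."""
    pts = np.asarray(foot_pixels, dtype=np.float32).reshape(-1, 1, 2)
    ground = cv2.perspectiveTransform(pts, H).reshape(-1, 2)
    diff = ground[:, None, :] - ground[None, :, :]
    return np.linalg.norm(diff, axis=-1)          # symmetric distance matrix

# Example: three detected people; flag pairs closer than an assumed 2 m threshold.
d = interpersonal_distances([[400, 650], [450, 640], [900, 500]])
too_close = np.argwhere((d < 2.0) & (d > 0.0))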
{"title":"Reducing Viral Transmission through AI-based Crowd Monitoring and Social Distancing Analysis","authors":"Benjamin Fraser, Brendan Copp, Gurpreet Singh, Orhan Keyvan, Tongfei Bian, Valentin Sonntag, Yang Xing, Weisi Guo, A. Tsourdos","doi":"10.1109/MFI55806.2022.9913843","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913843","url":null,"abstract":"This paper explores multi-person pose estimation for reducing the risk of airborne pathogens. The recent COVID-19 pandemic highlights these risks in a globally connected world. We developed several techniques which analyse CCTV inputs for crowd analysis. The framework utilised automated homography from pose feature positions to determine interpersonal distance. It also incorporates mask detection by using pose features for an image classification pipeline. A further model predicts the behaviour of each person by using their estimated pose features. We combine the models to assess transmission risk based on recent scientific literature. A custom dashboard displays a risk density heat-map in real time. This system could improve public space management and reduce transmission in future pandemics. This context agnostic system and has many applications for other crowd monitoring problems.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115955834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Harmonic Functions for Three-Dimensional Shape Estimation in Cylindrical Coordinates
Tim Baur, J. Reuter, Antonio Zea, U. Hanebeck
The high resolution of modern sensors such as multilayer LiDARs makes it possible to estimate the 3D shape of an object within an extended object tracking procedure. In recent years, 3D shapes have been estimated in spherical coordinates using Gaussian processes, spherical double Fourier series or spherical harmonics. However, observations have shown that in many scenarios only a few measurements are obtained from top or bottom surfaces, leading to error-prone estimates in spherical coordinates. Therefore, in this paper we propose to estimate the shape in cylindrical coordinates instead, applying harmonic functions. Specifically, we derive an expansion for 3D shapes in cylindrical coordinates by solving a boundary value problem for the Laplace equation. This shape representation is then integrated in a plain greedy association model and compared to shape estimation procedures in spherical coordinates. Since the shape representation is only integrated in a basic estimator, the results are preliminary and a detailed discussion of future work is presented at the end of the paper.
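For reference, the separable solutions of the Laplace equation in cylindrical coordinates, which an expansion of this kind builds on (the specific boundary value problem and truncation used by the authors are not stated in the abstract), are:

\nabla^2 f = \frac{1}{\rho}\frac{\partial}{\partial\rho}\!\left(\rho\,\frac{\partial f}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial z^2} = 0,
\qquad
f(\rho,\varphi,z) = \sum_{m,k}\left[A_{mk}\,J_m(k\rho) + B_{mk}\,Y_m(k\rho)\right] e^{\mathrm{i}m\varphi}\left(C_{mk}\,e^{kz} + D_{mk}\,e^{-kz}\right),

where J_m and Y_m are Bessel functions of the first and second kind.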
{"title":"Harmonic Functions for Three-Dimensional Shape Estimation in Cylindrical Coordinates","authors":"Tim Baur, J. Reuter, Antonio Zea, U. Hanebeck","doi":"10.1109/MFI55806.2022.9913858","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913858","url":null,"abstract":"With the high resolution of modern sensors such as multilayer LiDARs, estimating the 3D shape in an extended object tracking procedure is possible. In recent years, 3D shapes have been estimated in spherical coordinates using Gaussian processes, spherical double Fourier series or spherical harmonics. However, observations have shown that in many scenarios only a few measurements are obtained from top or bottom surfaces, leading to error-prone estimates in spherical coordinates. Therefore, in this paper we propose to estimate the shape in cylindrical coordinates instead, applying harmonic functions. Specifically, we derive an expansion for 3D shapes in cylindrical coordinates by solving a boundary value problem for the Laplace equation. This shape representation is then integrated in a plain greedy association model and compared to shape estimation procedures in spherical coordinates. Since the shape representation is only integrated in a basic estimator, the results are preliminary and a detailed discussion for future work is presented at the end of the paper.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133995905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Vision-enhanced GNSS-based environmental context detection for autonomous vehicle navigation
Florent Feriol, Yoko Watanabe, Damien Vivet
Context-adaptive navigation is currently considered one of the potential solutions to achieve more precise and robust positioning. The goal is to adapt the sensor parameters and the navigation filter structure so that they take into account the context-dependent sensor performance, notably GNSS signal degradations. For that, reliable context detection is essential. This paper proposes a GNSS-based environmental context detector which classifies the environment surrounding a vehicle into four classes: canyon, open-sky, trees and urban. A support-vector machine classifier is trained on our database collected around Toulouse. We first show the classification results of a model based on GNSS data only, revealing its limitation in distinguishing tree and urban contexts. To address this issue, this paper proposes a vision-enhanced model that adds satellite visibility information from sky segmentation on fisheye camera images. Compared to the GNSS-only model, the proposed vision-enhanced model significantly improved the classification performance and raised the average F1-score from 78% to 86%.
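A minimal sketch of the classification stage, assuming per-epoch feature vectors built from GNSS observables plus a sky-visibility ratio from fisheye-image segmentation (the feature set, file names and SVM hyperparameters below are illustrative, not the paper's):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import f1_score

# Hypothetical features per epoch: number of tracked satellites, mean C/N0,
# dilution of precision, ... plus the visible-sky ratio from segmentation.
X_train = np.load("features_train.npy")   # (N, n_features), assumed file
y_train = np.load("labels_train.npy")     # 0=canyon, 1=open-sky, 2=trees, 3=urban
X_test = np.load("features_test.npy")
y_test = np.load("labels_test.npy")

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)
print("average F1-score:", f1_score(y_test, clf.predict(X_test), average="macro"))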
{"title":"Vision-enhanced GNSS-based environmental context detection for autonomous vehicle navigation","authors":"Florent Feriol, Yoko Watanabe, Damien Vivet","doi":"10.1109/MFI55806.2022.9913867","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913867","url":null,"abstract":"Context-adaptive navigation is currently considered as one of the potential solutions to achieve a more precise and robust positioning. The goal would be to adapt the sensor parameters and the navigation filter structure so that it takes into account the context-dependant sensor performance, notably GNSS signal degradations. For that, a reliable context detection is essential. This paper proposes a GNSS-based environmental context detector which classifies the environment surrounding a vehicle into four classes: canyon, open-sky, trees and urban. A support-vector machine classifier is trained on our database collected around Toulouse. We first show the classification results of a model based on GNSS data only, revealing its limitation to distinguish trees and urban contexts. For addressing this issue, this paper proposes the vision-enhanced model by adding satellite visibility information from sky segmentation on fisheye camera images. Compared to the GNSS-only model, the proposed vision-enhanced model significantly improved the classification performance and raised an average F1-score from 78% to 86%.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122849670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Predicting Autonomous Vehicle Navigation Parameters via Image and Image-and-Point Cloud Fusion-based End-to-End Methods
Semih Beycimen, Dmitry I. Ignatyev, A. Zolotas
This paper presents a study of end-to-end methods for predicting autonomous vehicle navigation parameters. Image-based and image-and-LiDAR-point-based end-to-end models have been trained on the Nvidia learning architecture as well as Densenet-169, Resnet-152 and Inception-v4. Various learning parameters for autonomous vehicle navigation, input models and data pre-processing algorithms, i.e. image cropping, noise removal and semantic segmentation for image data, have been investigated and tested. The best-performing ones from this investigation are selected for the main framework of the study. Results reveal that the image-and-LiDAR-point-based method trained on the Nvidia architecture offers the best accuracy for steering angle and speed.
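As a hedged sketch of what an end-to-end regressor of this kind can look like, the following PyTorch model follows the layer sizes of NVIDIA's published PilotNet; the two-output head for steering angle and speed and the 66x200 input size are assumptions, not the exact networks evaluated in the paper:

import torch
import torch.nn as nn

class PilotNetStyle(nn.Module):
    """PilotNet-like CNN regressing steering angle and speed from a 66x200 RGB frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ELU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ELU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ELU(),
            nn.Conv2d(48, 64, 3), nn.ELU(),
            nn.Conv2d(64, 64, 3), nn.ELU(),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 1 * 18, 100), nn.ELU(),
            nn.Linear(100, 50), nn.ELU(),
            nn.Linear(50, 10), nn.ELU(),
            nn.Linear(10, 2),                 # [steering_angle, speed]
        )

    def forward(self, x):                     # x: (batch, 3, 66, 200)
        return self.regressor(self.features(x))

out = PilotNetStyle()(torch.randn(1, 3, 66, 200))   # -> shape (1, 2)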
{"title":"Predicting Autonomous Vehicle Navigation Parameters via Image and Image-and-Point Cloud Fusion-based End-to-End Methods","authors":"Semih Beycimen, Dmitry I. Ignatyev, A. Zolotas","doi":"10.1109/MFI55806.2022.9913844","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913844","url":null,"abstract":"This paper presents a study of end-to-end methods for predicting autonomous vehicle navigation parameters. Image-based and Image & Lidar points-based end-to-end models have been trained under Nvidia learning architectures as well as Densenet-169, Resnet-152 and Inception-v4. Various learning parameters for autonomous vehicle navigation, input models and pre-processing data algorithms i.e. image cropping, noise removing, semantic segmentation for image data have been investigated and tested. The best ones, from the rigorous investigation, are selected for the main framework of the study. Results reveal that the Nvidia architecture trained Image & Lidar points-based method offers the better results accuracy rate-wise for steering angle and speed.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114716226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Statistically Motivated Likelihood for Track-Before-Detect
John Daniel Bossér, Gustaf Hendeby, M. L. Nordenvaad, I. Skog
A theoretically sound likelihood function for passive sonar surveillance using a hydrophone array is presented. The likelihood is derived from first order principles along with the assumption that the source signal can be approximated as white Gaussian noise within the considered frequency band. The resulting likelihood is a nonlinear function of the delay-and-sum beamformer response and signal-to-noise ratio (SNR). Evaluation of the proposed likelihood function is done by using it in a Bernoulli filter based track-before-detect (TkBD) framework. As a reference, the same TkBD framework, but with another beamforming response based likelihood, is used. Results from Monte-Carlo simulations of two bearings-only tracking scenarios are presented. The results show that the TkBD framework with the proposed likelihood yields approximately 10 seconds faster target detection for a target at an SNR of -27 dB, and a lower bearing tracking error. Compared to a classical detect-and-track target tracker, the TkBD framework with the proposed likelihood yields 4 dB to 5 dB detection gain.
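Since the likelihood is a function of the delay-and-sum beamformer response, a minimal numpy sketch of that response for a uniform linear hydrophone array is given below (the array geometry and parameters are assumptions; the likelihood expression itself is derived in the paper and not reproduced here):

import numpy as np

def delay_and_sum_power(x, fs, d, c, bearings):
    """Frequency-domain delay-and-sum beamformer power for a uniform linear array.
    x: (n_sensors, n_samples) snapshot, fs: sampling rate [Hz],
    d: sensor spacing [m], c: sound speed [m/s], bearings: angles [rad]."""
    n_sensors, n_samples = x.shape
    X = np.fft.rfft(x, axis=1)                    # per-sensor spectra
    freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
    pos = d * np.arange(n_sensors)
    power = np.zeros(len(bearings))
    for i, theta in enumerate(bearings):
        delays = pos * np.sin(theta) / c          # relative time delays [s]
        steer = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        power[i] = np.sum(np.abs(np.sum(X * steer, axis=0)) ** 2)
    return power

# Example: 16 hydrophones, 0.75 m spacing, scanning -90..90 degrees.
rng = np.random.default_rng(0)
snapshot = rng.standard_normal((16, 1024))
p = delay_and_sum_power(snapshot, fs=8000.0, d=0.75, c=1500.0,
                        bearings=np.deg2rad(np.linspace(-90, 90, 181)))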
{"title":"A Statistically Motivated Likelihood for Track-Before-Detect","authors":"John Daniel Bossér, Gustaf Hendeby, M. L. Nordenvaad, I. Skog","doi":"10.1109/MFI55806.2022.9913853","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913853","url":null,"abstract":"A theoretically sound likelihood function for passive sonar surveillance using a hydrophone array is presented. The likelihood is derived from first order principles along with the assumption that the source signal can be approximated as white Gaussian noise within the considered frequency band. The resulting likelihood is a nonlinear function of the delay-and-sum beamformer response and signal-to-noise ratio (SNR).Evaluation of the proposed likelihood function is done by using it in a Bernoulli filter based track-before-detect (TkBD) framework. As a reference, the same TkBD framework, but with another beamforming response based likelihood, is used. Results from Monte-Carlo simulations of two bearings-only tracking scenarios are presented. The results show that the TkBD framework with the proposed likelihood yields an approx. 10 seconds faster target detection for a target at an SNR of -27 dB, and a lower bearing tracking error. Compared to a classical detect-and-track target tracker, the TkBD framework with the proposed likelihood yields 4 dB to 5 dB detection gain.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122586815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Global-local Feature Aggregation for Event-based Object Detection on EventKITTI
Zichen Liang, Hu Cao, Chu Yang, Zikai Zhang, G. Chen
Event sequences convey asynchronous pixel-wise visual information in a low-power and high-temporal-resolution manner, which enables more robust perception under challenging conditions, e.g., fast motion. Two main factors limit the development of event-based object detection in traffic scenes: lack of high-quality datasets and effective event-based algorithms. To solve the first problem, we propose a simulated event-based detection dataset named EventKITTI, which incorporates the novel event modality information into a mixed two-level (i.e. object-level and video-level) detection dataset under traffic scenarios. EventKITTI possesses high-quality event streams and the largest number of categories at microsecond temporal resolution and 1242×375 spatial resolution, exceeding existing datasets. As for the second problem, existing algorithms rely on CNN-based, spiking or graph architectures to capture local features of moving objects, leading to poor performance on objects with incomplete contours. Hence, we propose event-based object detectors named GFA-Net and CGFA-Net. To enhance the global-local learning ability in the spatial dimension, GFA-Net introduces a transformer with edge-based position encoding and multi-scale feature fusion to detect objects on static frames. Furthermore, CGFA-Net optimizes edge-based position encoding with closed-loop learning based on the previously detected heatmap, which aggregates temporal global features across event frames. The proposed event-based object detectors achieve the best speed-accuracy trade-off on EventKITTI, approaching an 81.3% mAP at 33.0 FPS on the object-level detection dataset and a 64.5% mAP at 30.3 FPS on the video-level detection dataset.
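As context for how event data reaches a frame-based detector, a common (generic) preprocessing step is to accumulate an event stream into a two-channel count image; this sketch is illustrative and is not the GFA-Net/CGFA-Net representation itself:

import numpy as np

def events_to_frame(events, height=375, width=1242, window_us=50_000):
    """Accumulate (t[us], x, y, polarity) events of one time window into a
    2-channel per-polarity count image. `events` is an (N, 4) array sorted by time."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    t0 = events[0, 0]
    for t, x, y, p in events:
        if t - t0 > window_us:
            break
        frame[int(p > 0), int(y), int(x)] += 1.0
    return frame

# Tiny example: two events with opposite polarity.
evts = np.array([[0, 10, 20, 1], [5, 11, 20, -1]], dtype=np.int64)
frame = events_to_frame(evts)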
{"title":"Global-local Feature Aggregation for Event-based Object Detection on EventKITTI","authors":"Zichen Liang, Hu Cao, Chu Yang, Zikai Zhang, G. Chen","doi":"10.1109/MFI55806.2022.9913852","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913852","url":null,"abstract":"Event sequence conveys asynchronous pixel-wise visual information in a low power and high temporal resolution manner, which enables more robust perception under challenging conditions, e.g., fast motion. Two main factors limit the development of event-based object detection in traffic scenes: lack of high-quality datasets and effective event-based algorithms. To solve the first problem, we propose a simulated event-based detection dataset named EventKITTI, which incorporates the novel event modality information into a mixed two-level (i.e. object-level and video-level) detection dataset under traffic scenarios. EventKITTI possesses the high-quality event stream and the largest number of categories at microsecond temporal resolution and 1242×375 spatial resolution, exceeding existing datasets. As for the second problem, existing algorithms rely on CNN-based, spiking or graph architectures to capture local features of moving objects, leading to poor performance in objects with incomplete contours. Hence, we propose event-based object detectors named GFA-Net and CGFA-Net. To enhance the global-local learning ability in the spatial dimension, GFA-Net introduces transformer with edge-based position encoding and multi-scale feature fusion to detect objects on static frame. Furthermore, CGFA-Net optimizes edge-based position encoding with close-loop learning based on previous detected heatmap, which aggregates temporal global features across event frames. The proposed event-based object detectors achieve the best speed-accuracy trade-off on EventKITTI, approaching an 81.3% MAP at 33.0 FPS on object-level detection dataset and a 64.5% MAP at 30.3 FPS on video-level detection dataset.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114185703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Noninformative Prior Weights for Dirichlet PDFs*
A. Jøsang, Jinny Cho, Feng Chen
The noninformative prior weight W of a Dirichlet PDF (Probability Density Function) determines the balance between the prior probability and the influence of new observations on the posterior probability distribution. In this work, we propose a method for dynamically converging the weight W in a way that satisfies two constraints. The first constraint is that the prior Dirichlet PDF (i.e. in the absence of evidence) must always be uniform, which dictates that W = k where k is the cardinality of the domain. The second constraint is that the prior weight of large domains must not be so heavy that it prevents new observation evidence from having the expected influence over the shape of the Dirichlet PDF, which dictates that W quickly converges to a low constant CW in the presence of observation evidence, where typically CW = 2. In the case of a binary domain, the noninformative prior weight is normally set to W = 2, irrespective of the amount of evidence. In the case of a multidimensional domain with arbitrarily large cardinality k, the noninformative prior weight is initially equal to the domain cardinality k, but rapidly decreases to the constant convergence factor CW as the amount of evidence increases.
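To make the role of W concrete, the standard Dirichlet posterior expectation with noninformative prior weight W over a domain of cardinality k (uniform base rates a_i = 1/k, evidence counts r_i) is

\operatorname{E}[p_i] = \frac{r_i + W\,a_i}{W + \sum_{j=1}^{k} r_j}, \qquad a_i = \frac{1}{k},

and one purely illustrative convergence rule that satisfies both constraints (W equals k with no evidence and quickly approaches C_W as evidence accumulates; the paper's actual rule may differ) is

W(\mathbf{r}) = C_W + (k - C_W)\,e^{-\sum_{j=1}^{k} r_j}, \qquad C_W = 2.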
{"title":"Noninformative Prior Weights for Dirichlet PDFs*","authors":"A. Jøsang, Jinny Cho, Feng Chen","doi":"10.1109/MFI55806.2022.9913864","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913864","url":null,"abstract":"The noninformative prior weight W of a Dirichlet PDF (Probability Density Function) determines the balance between the prior probability and the influence of new observations on the posterior probability distribution. In this work, we propose a method for dynamically converging the weight W in a way that satisfies two constraints. The first constraint is that the prior Dirichlet PDF (i.e. in the absence of evidence) must always be uniform, which dictates that W = k where k is the cardinality of the domain. The second constraint is that the prior weight of large domains must not be so heavy that it prevents new observation evidence from having the expected influence over the shape of the Dirichlet PDF, which dictates that W quickly converges to a low constant CW in the presence of observation evidence, where typically CW = 2. In the case of a binary domain, the noninformative prior weight is normally set to W = 2, irrespective of the amount of evidence. In the case of a multidimensional domain with arbitrarily large cardinality k, the noninformative prior weight is initially equal to the domain cardinality k, but rapidly decreases to the constant convergence factor CW as the amount of evidence increases.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130835556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Context Extraction from GIS Data Using LiDAR and Camera Features
Juan D. González, Hans-Joachim Wünsche
We propose a method to extract the spatial context of unknown objects in a driving scenario by classifying the surfaces on which the traffic participants travel. In order to classify these surfaces without the need for a large amount of labeled data, we resort to an unsupervised learning method that clusters patches of terrain using features extracted from LiDAR and image data. Using an iterative method, we are able to model the characteristics of map features from a geographical information system (GIS), such as streets and sidewalks, and extend their contextual meaning to the area around our test vehicle. We evaluate our results using a partially labeled 3D scan of our campus and find that our method is able to correctly extract and extend the spatial context of the map features from the GIS to the labeled surfaces on the campus.
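A minimal sketch of the unsupervised clustering stage, assuming per-patch descriptors combining LiDAR statistics and image colour/texture (the specific clustering algorithm, feature set and file name are assumptions, since the abstract does not name them):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-patch descriptors, e.g. LiDAR height statistics, roughness,
# intensity, and mean image colour/texture values.
patch_features = np.load("terrain_patches.npy")   # (n_patches, n_features), assumed file

X = StandardScaler().fit_transform(patch_features)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
# The clusters can then be matched against GIS polygons (streets, sidewalks, ...)
# to attach a contextual label to each terrain cluster.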
{"title":"Context Extraction from GIS Data Using LiDAR and Camera Features","authors":"Juan D. González, Hans-Joachim Wünsche","doi":"10.1109/MFI55806.2022.9913849","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913849","url":null,"abstract":"We propose a method to extract spatial context of unknown objects in a driving scenario by classifying the surfaces in which the traffic participants transit. In order to classify these surfaces without the need for a big amount of labeled data, we resort to an unsupervised learning method that clusters patches of terrain using features extracted from LiDAR and image data. Using an iterative method, we are able to model the characteristics of map features from a geographical information system (GIS), such as streets and sidewalks, and extend their contextual meaning to the area around our test vehicle. We evaluate our results using a partially labeled 3D scan of our campus and find that our method is able to correctly extract and extend the spatial context of the map features from the GIS to the labeled surfaces on the campus.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128077425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Benchmark for Vision-based Multi-UAV Multi-object Tracking
Hao Shen, Xiwen Yang, D. Lin, Jianduo Chai, Jiakai Huo, Xiaofeng Xing, Shaoming He
Vision-based multi-sensor multi-object tracking is a fundamental task in applications of swarms of Unmanned Aerial Vehicles (UAVs). Benchmark datasets are critical to the development of computer vision research since they provide a fair and principled way to evaluate various approaches and promote the improvement of the corresponding algorithms. In recent years, many benchmarks have been created for single-camera single-object tracking, single-camera multi-object detection, and single-camera multi-object tracking scenarios. However, to the best of our knowledge, few benchmarks for multi-camera multi-object tracking have been provided. In this paper, we build a dataset for multi-UAV multi-object tracking tasks to fill the gap. Several cameras are placed in a VICON motion capture system to simulate the UAV team, and several toy cars are employed to represent ground targets. The first-person-view videos from the cameras, the motion states of the cameras, and the ground truth of the objects are recorded. We also propose a metric to evaluate the performance of the multi-UAV multi-object tracking task. The dataset and the code for algorithm evaluation are available at our GitHub (https://github.com/bitshenwenxiao/MUMO).
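A typical building block of multi-object tracking metrics is the per-timestep assignment of estimated tracks to ground-truth objects; the sketch below uses the Hungarian algorithm for this and is generic, not necessarily the metric proposed in the paper (which is defined there and implemented in the linked repository):

import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracks(gt_xy, est_xy, gate=1.0):
    """Assign estimated tracks to ground-truth objects at one timestep by minimising
    total Euclidean distance; pairs farther apart than `gate` metres are discarded."""
    cost = np.linalg.norm(gt_xy[:, None, :] - est_xy[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= gate]
    misses = gt_xy.shape[0] - len(matches)         # unmatched ground-truth objects
    false_tracks = est_xy.shape[0] - len(matches)  # unmatched estimates
    return matches, misses, false_tracks

gt = np.array([[0.0, 0.0], [5.0, 5.0]])
est = np.array([[0.2, -0.1], [9.0, 9.0]])
print(match_tracks(gt, est))   # -> ([(0, 0)], 1, 1)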
{"title":"A Benchmark for Vision-based Multi-UAV Multi-object Tracking","authors":"Hao Shen, Xiwen Yang, D. Lin, Jianduo Chai, Jiakai Huo, Xiaofeng Xing, Shaoming He","doi":"10.1109/MFI55806.2022.9913874","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913874","url":null,"abstract":"Vision-based multi-sensor multi-object tracking is a fundamental task in the applications of a swarm of Unmanned Aerial Vehicles (UAVs). The benchmark datasets are critical to the development of computer vision research since they can provide a fair and principled way to evaluate various approaches and promote the improvement of corresponding algorithms. In recent years, many benchmarks have been created for single-camera single-object tracking, single-camera multi-object detection, and single-camera multi-object tracking scenarios. However, up to the best of our knowledge, few benchmarks of multi-camera multi-object tracking have been provided. In this paper, we build a dataset for multi-UAV multi-object tracking tasks to fill the gap. Several cameras are placed in the VICON motion capture system to simulate the UAV team, and several toy cars are employed to represent ground targets. The first-perspective videos from the cameras, the motion states of the cameras, and the ground truth of the objects are recorded. We also propose a metric to evaluate the performance of the multi-UAV multi-object tracking task. The dataset and the code for algorithm evaluation are available at our GitHub (https://github.com/bitshenwenxiao/MUMO).","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131445921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Vision-based Fall Detection in Aircraft Maintenance Environment with Pose Estimation
Adeyemi Osigbesan, Solene Barrat, Harkeerat Singh, Dongzi Xia, Siddharth Singh, Yang Xing, Weisi Guo, A. Tsourdos
According to the Health and Safety Executive (HSE), fall-related injuries in the workplace account for a significant share of global work accident claims. With a considerable percentage of these being fatal, industrial and maintenance workshops have great potential for injuries associated with slips, trips, and other types of falls, owing to their characteristically fast-paced workspaces. Typically, the short turnaround time expected for aircraft undergoing maintenance increases the risk of workers falling, and thus makes a good case for the study of more contemporary methods for detecting work-related falls in the aircraft maintenance environment. Advances in human pose estimation using computer vision technology have made it possible to automate real-time detection and classification of human actions by analyzing body part motion and position relative to time. This paper combines the analysis of the body silhouette bounding box with body joint position estimation to detect and categorize, in real time, human motion captured in continuous video feeds as a fall or a non-fall event. We propose a standard wide-angle camera, installed at a diagonal ceiling position in an aircraft hangar, as our visual data input, and a three-dimensional convolutional neural network with Long Short-Term Memory (LSTM) layers using a technique we refer to as Region Key point (Reg-Key) repartitioning for visual pose estimation and fall detection.
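As a hedged sketch of a 3D-CNN + LSTM clip classifier of the general kind described (layer sizes, input resolution and clip length are placeholders, and this is not the Reg-Key network from the paper):

import torch
import torch.nn as nn

class Fall3DCNNLSTM(nn.Module):
    """Illustrative 3D-CNN + LSTM binary fall/non-fall classifier over short video clips."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 4, 4)),     # keep the temporal axis length
        )
        self.lstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)            # logits: [non-fall, fall]

    def forward(self, clip):                        # clip: (batch, 3, T, H, W)
        f = self.cnn3d(clip)                        # (batch, 32, T, 4, 4)
        f = f.permute(0, 2, 1, 3, 4).flatten(2)     # (batch, T, 32*4*4)
        out, _ = self.lstm(f)
        return self.head(out[:, -1])                # classify from the last timestep

logits = Fall3DCNNLSTM()(torch.randn(2, 3, 16, 112, 112))   # -> shape (2, 2)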
{"title":"Vision-based Fall Detection in Aircraft Maintenance Environment with Pose Estimation","authors":"Adeyemi Osigbesan, Solene Barrat, Harkeerat Singh, Dongzi Xia, Siddharth Singh, Yang Xing, Weisi Guo, A. Tsourdos","doi":"10.1109/MFI55806.2022.9913877","DOIUrl":"https://doi.org/10.1109/MFI55806.2022.9913877","url":null,"abstract":"Fall-related injuries at the workplace account for a fair percentage of the global accident at work claims according to Health and Safety Executive (HSE). With a significant percentage of these being fatal, industrial and maintenance workshops have great potential for injuries that can be associated with slips, trips, and other types of falls, owing to their characteristic fast-paced workspaces. Typically, the short turnaround time expected for aircraft undergoing maintenance increases the risk of workers falling, and thus makes a good case for the study of more contemporary methods for the detection of work-related falls in the aircraft maintenance environment. Advanced development in human pose estimation using computer vision technology has made it possible to automate real-time detection and classification of human actions by analyzing body part motion and position relative to time. This paper attempts to combine the analysis of body silhouette bounding box with body joint position estimation to detect and categorize in real-time, human motion captured in continuous video feeds into a fall or a non-fall event. We proposed a standard wide-angle camera, installed at a diagonal ceiling position in an aircraft hangar for our visual data input, and a three-dimensional convolutional neural network with Long Short-Term Memory (LSTM) layers using a technique we referred to as Region Key point (Reg-Key) repartitioning for visual pose estimation and fall detection.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124423667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0