2023 18th International Conference on Machine Vision and Applications (MVA) — Latest Publications
Panoptic Segmentation of Galactic Structures in LSB Images
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10216057
Felix Richards, A. Paiement, Xianghua Xie, Elisabeth Sola, P. Duc
We explore the use of deep learning to localise galactic structures in low surface brightness (LSB) images. LSB imaging reveals many interesting structures, though these are frequently confused with galactic dust contamination, due to a strong local visual similarity. We propose a novel unified approach to multi-class segmentation of galactic structures and of extended amorphous image contaminants. Our panoptic segmentation model combines Mask R-CNN with a contaminant specialised network and utilises an adaptive preprocessing layer to better capture the subtle features of LSB images. Further, a human-in-the-loop training scheme is employed to augment ground truth labels. These different approaches are evaluated in turn, and together greatly improve the detection of both galactic structures and contaminants in LSB images.
Citations: 1
Mixed Distillation for Unsupervised Anomaly Detection
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215597
Fuzhen Cai, Siyu Xia
Anomaly detection is typically an unsupervised learning problem in which the model is trained with only normal samples. Knowledge distillation (KD) has shown promising results in image anomaly detection, especially for texture images. However, in the classical KD model knowledge is transferred step by step from the shallow layers to the deep ones, so the deep layers fit poorly whenever the shallow layers of the student network do not fully match the teacher's. To address this, we propose a skip distillation method that lets the deep layers of the student network learn directly from the shallow layers of the teacher, avoiding a poor deep-layer fit. We also design a symmetric path that lets the shallow layers of the student network learn directly from the deep layers of the teacher. Together, these two paths encode sufficient information for the student network. Thorough experiments on the anomaly detection benchmark dataset MvtecAD show that our model exceeds the current state-of-the-art anomaly detection methods on texture classes.
Citations: 0
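The cross-layer pairing described in the abstract can be sketched as a loss between mismatched teacher/student stages. This is a minimal illustrative sketch, not the paper's implementation: the feature shapes, the assumption that paired stages share a shape, and the unweighted sum are all simplifications.

```python
import numpy as np

def feat_mse(a, b):
    # mean-squared error between two same-shape feature maps
    return float(np.mean((a - b) ** 2))

def skip_distillation_loss(teacher_feats, student_feats):
    """Cross-layer ("skip") distillation: the student's deep stage matches
    the teacher's shallow stage and vice versa, so each student stage gets
    a direct target instead of relying on step-by-step transfer."""
    t_shallow, t_deep = teacher_feats
    s_shallow, s_deep = student_feats
    return feat_mse(t_shallow, s_deep) + feat_mse(t_deep, s_shallow)

rng = np.random.default_rng(0)
teacher = (rng.standard_normal((4, 4)), rng.standard_normal((4, 4)))
student = (teacher[1].copy(), teacher[0].copy())  # student mirrors the teacher
print(skip_distillation_loss(teacher, student))   # → 0.0
```

A real training loop would add projection layers so differently shaped stages can be compared, and would minimize this loss alongside the task objective.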
Multi-Prior Based Multi-Scale Condition Network for Single-Image HDR Reconstruction
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10216063
Haorong Jiang, Fengshan Zhao, Junda Liao, Qin Liu, T. Ikenaga
High Dynamic Range (HDR) imaging aims to reconstruct the natural appearance of real-world scenes by expanding the bit depth of captured images. However, due to the imaging pipeline of off-the-shelf cameras, information loss in over-exposed areas and noise in under-exposed areas pose significant challenges for single-image HDR imaging. As a result, the key to success lies in restoring over-exposed regions and denoising under-exposed regions. In this paper, a multi-prior based multi-scale condition network is proposed to address this issue. (1) Three types of prior knowledge modulate the intermediate features in the reconstruction network from different perspectives, resulting in improved modulation effects. (2) Multi-scale fusion extracts and integrates deep semantic information from various priors. Experiments on the NTIRE HDR challenge dataset demonstrate that the proposed method achieves state-of-the-art quantitative results.
Citations: 0
Safe height estimation of deformable objects for picking robots by detecting multiple potential contact points
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215690
Jaesung Yang, Daisuke Hagihara, Kiyoto Ito, Nobuhiro Chihara
Object sorting in logistics warehouses is still carried out manually, and there is a great need for automation with arm robots. Target objects should be placed carefully in situations where careful handling of products is important. We propose a method for estimating the height of a picked object with a single depth camera to achieve precise placing of items, such as stacking, especially for objects that are deformable, e.g., bags. The proposed method uses the point-cloud difference before and after picking to detect multiple potential contact points of the picked object and estimate the appropriate height at which to place it. The validity of the proposed method was verified using 26 cases in which deformable objects were placed inside a container, confirming that object-height estimation is possible with an average error of 3.2 mm.
Citations: 0
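The geometric core of the idea above can be sketched from the point-cloud difference alone. This is a hypothetical sketch: the function name, the contact tolerance `z_tol`, and the flat-container assumption are all ours, and the actual method works from depth images rather than a ready-made difference cloud.

```python
import numpy as np

def potential_contacts_and_height(picked_points, gripper_z, z_tol=0.005):
    """picked_points: Nx3 points that vanished from the scene after the
    pick, i.e. the (possibly deformed) picked object. Every point within
    z_tol of the object's lowest point is a potential contact point; the
    object hangs (gripper_z - z_min) below the gripper reference plane,
    which bounds how far it must be lowered before a safe release."""
    z = picked_points[:, 2]
    z_min = float(z.min())
    contacts = picked_points[np.abs(z - z_min) <= z_tol]
    hang_depth = gripper_z - z_min
    return contacts, hang_depth

obj = np.array([[0.0, 0.0, 0.000],
                [0.0, 0.1, 0.002],   # both low points are potential contacts
                [0.1, 0.0, 0.080]])  # higher point, ignored
contacts, hang = potential_contacts_and_height(obj, gripper_z=0.20)
print(len(contacts), hang)  # → 2 0.2
```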
Video Anomaly Detection Using Encoder-Decoder Networks with Video Vision Transformer and Channel Attention Blocks
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215921
Shimpei Kobayashi, A. Hizukuri, R. Nakayama
Surveillance cameras have been installed in various locations for public safety. However, continuously watching surveillance footage that contains few abnormal events is tedious for security personnel. The purpose of this study is to develop a computerized anomaly detection method for surveillance camera footage. Our database consisted of three public datasets for anomaly detection: the UCSD Pedestrian 1, UCSD Pedestrian 2, and CUHK Avenue datasets. In the proposed network, channel attention blocks are introduced into TransAnomaly, an existing anomaly detection model, to focus on important channel information. The areas under the receiver operating characteristic curves (AUCs) with the proposed network were 0.827 for UCSD Pedestrian 1, 0.964 for UCSD Pedestrian 2, and 0.854 for CUHK Avenue. These AUCs were greater than those for a conventional TransAnomaly without channel attention blocks (0.767, 0.934, and 0.839, respectively).
Citations: 0
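A squeeze-and-excitation style block is one common realisation of the "channel attention blocks" named above; the paper's exact design may differ, and the zero-initialised weights here are purely illustrative.

```python
import numpy as np

def channel_attention(feats, w1, w2):
    """SE-style channel attention over a (C, H, W) feature map:
    global-average-pool each channel, pass through a ReLU bottleneck,
    gate with a sigmoid, then reweight each channel by its gate value."""
    c = feats.shape[0]
    squeeze = feats.reshape(c, -1).mean(axis=1)      # (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)           # (C//r,) bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # (C,) values in (0, 1)
    return feats * gate[:, None, None]

feats = np.ones((4, 2, 2))
w1 = np.zeros((2, 4))   # reduction ratio r = 2
w2 = np.zeros((4, 2))
out = channel_attention(feats, w1, w2)
print(out[0, 0, 0])  # → 0.5 (zero weights give a sigmoid gate of 0.5)
```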
TinyPedSeg: A Tiny Pedestrian Segmentation Benchmark for Top-Down Drone Images
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215829
Y. Sahin, Elvin Abdinli, M. A. Aydin, Gozde Unal
The usage of Unmanned Aerial Vehicles (UAVs) has increased significantly in fields such as surveillance, agriculture, transportation, and military operations. However, integrating UAVs into these applications requires the ability to navigate autonomously and detect/segment objects in real time, which can be achieved with neural networks. Although object detection in RGB images/videos captured from UAVs has been widely studied, limited effort has been made on segmentation in top-down aerial images. When the UAV flies extremely high above the ground, the task becomes tiny object segmentation. Thus, inspired by the TinyPerson dataset, which focuses on person detection from UAVs, we present TinyPedSeg, which contains 2563 pedestrians in 320 images. Specialized solely in pedestrian segmentation, our dataset is more informative than other UAV segmentation datasets. The dataset and the baseline codes are available at https://github.com/ituvisionlab/tinypedseg
Citations: 0
Uncertainty Criteria in Active Transfer Learning for Efficient Video-Specific Human Pose Estimation
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215565
Hiromu Taketsugu, N. Ukita
This paper presents a combination of Active Learning (AL) and Transfer Learning (TL) for efficiently adapting Human Pose (HP) estimators to individual videos. The proposed approach quantifies estimation uncertainty through the temporal changes and unnaturalness of estimated HPs. These uncertainty criteria are combined with a clustering-based representativeness criterion to avoid the useless selection of similar samples. Experiments demonstrated that the proposed method achieves high learning efficiency and outperforms comparative methods.
Citations: 0
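The temporal-change cue mentioned in the abstract can be sketched as mean frame-to-frame joint displacement. This simplified proxy illustrates only one of the uncertainty criteria, not the full AL pipeline; the function name, input shapes, and averaging scheme are our assumptions.

```python
import numpy as np

def temporal_instability(pose_seq):
    """pose_seq: (T, J, 2) estimated 2D joint positions over T frames.
    Large frame-to-frame displacement hints that the estimator is
    uncertain on this video, making it a good labelling candidate."""
    step = np.diff(pose_seq, axis=0)                   # (T-1, J, 2)
    return float(np.mean(np.linalg.norm(step, axis=-1)))

still = np.zeros((5, 3, 2))                  # perfectly stable poses
jitter = np.cumsum(np.ones((5, 3, 2)), 0)    # each joint drifts every frame
print(temporal_instability(still), temporal_instability(jitter))
```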
QaQ: Robust 6D Pose Estimation via Quality-Assessed RGB-D Fusion
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10216208
Théo Petitjean, Zongwei Wu, O. Laligant, C. Demonceaux
RGB-D 6D pose estimation has recently drawn great research attention thanks to the complementary depth information. However, in real industrial scenarios the depth and color images are often noisy, which is challenging for the many existing methods that fuse RGB and depth features equally. In this paper, we present a novel fusion design that adaptively merges RGB-D cues. Specifically, we create a quality-assessment block that estimates the global quality of the input modalities. This quality, represented as an α parameter, is then used to reinforce the fusion. We have thus found a simple and effective way to improve robustness to low-quality depth and RGB inputs. Extensive experiments on 6D pose estimation demonstrate the efficiency of our method, especially when noise is present in the input.
Citations: 0
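The α-weighted fusion described above reduces, at its simplest, to a convex combination of the two modality features. A minimal sketch, assuming α is a given scalar (in the paper it is predicted by the learned quality-assessment block):

```python
import numpy as np

def quality_assessed_fusion(rgb_feat, depth_feat, alpha):
    """alpha in [0, 1] scores the global quality of the RGB modality;
    low-quality RGB shifts the fused representation toward depth."""
    return alpha * rgb_feat + (1.0 - alpha) * depth_feat

rgb = np.full((2, 2), 1.0)
depth = np.full((2, 2), 3.0)
print(quality_assessed_fusion(rgb, depth, alpha=0.25))  # each entry 0.25*1 + 0.75*3 = 2.5
```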
Generalization of pixel-wise phase estimation by CNN and improvement of phase-unwrapping by MRF optimization for one-shot 3D scan
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10215780
Hiroto Harada, M. Mikamo, Furukawa Ryo, R. Sagawa, Hiroshi Kawasaki
Active stereo techniques using single-pattern projection, a.k.a. one-shot 3D scan, have drawn wide attention from industry, medicine, etc. One severe drawback of one-shot 3D scan is sparse reconstruction. In addition, since the spatial pattern becomes complicated for the purpose of efficient embedding, it is easily affected by noise, which results in unstable decoding. To solve these problems, we propose a pixel-wise interpolation technique for one-shot scans that is applicable to any type of static pattern, provided the pattern is regular and periodic. This is achieved by a U-net pre-trained on CG data with an efficient data augmentation algorithm. To further overcome the decoding instability, we propose a robust correspondence-finding algorithm based on Markov random field (MRF) optimization. We also propose a shape refinement algorithm based on b-spline and Gaussian kernel interpolation using explicitly detected laser curves. Experiments show the effectiveness of the proposed method on real data with strong noise and textures.
Citations: 0
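The Gaussian kernel interpolation named in the abstract can be sketched as Nadaraya-Watson smoothing over detected curve samples. The 1D formulation and the fixed bandwidth `h` are simplifying assumptions; the paper combines this with b-spline fitting for shape refinement.

```python
import numpy as np

def gaussian_kernel_interp(x, y, xq, h=1.0):
    """Nadaraya-Watson estimate: each query point in xq takes a
    Gaussian-weighted average of the observed samples (x, y)."""
    w = np.exp(-0.5 * ((xq[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

x = np.linspace(0.0, 1.0, 6)
y = np.full(6, 2.0)                      # a flat "curve"
xq = np.array([0.25, 0.5, 0.75])
print(gaussian_kernel_interp(x, y, xq))  # → [2. 2. 2.] (flat input stays flat)
```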
LOTS: Litter On The Sand dataset for litter segmentation
Pub Date : 2023-07-23 DOI: 10.23919/MVA57639.2023.10216220
Paola Barra, Alessia Auriemma Citarella, Giosuè Orefice, M. Castrillón-Santana, A. Ciaramella
The marine ecosystem is threatened by human waste released into the sea. Among the most challenging marine litter to identify and remove are the small particles settled on the sand, which may be ingested by local fauna or damage the marine ecosystem. These particles are not easy to identify because they are easily confused with natural maritime material, such as shells and stones, which cannot be classified as "litter". In this work we present the Litter On The Sand (LOTS) dataset, with images of clean, dirty, and wavy sand from 3 different beaches.
Citations: 0