ViTVO: Vision Transformer based Visual Odometry with Attention Supervision
Chu-Chi Chiu, Hsuan-Kung Yang, Hao-Wei Chen, Yu-Wen Chen, Chun-Yi Lee
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215538
In this paper, we develop a Vision Transformer-based visual odometry (VO) framework called ViTVO, which introduces an attention mechanism to perform visual odometry. Due to the nature of VO, Transformer-based VO models tend to over-concentrate on a few points, which may degrade accuracy. In addition, noise from dynamic objects often makes VO tasks difficult. To overcome these issues, we propose an attention loss during training, which utilizes ground-truth masks or self-supervision to guide the attention maps toward the static regions of an image. In our experiments, we demonstrate the superior performance of ViTVO on the Sintel validation set and validate the effectiveness of our attention supervision mechanism for VO tasks.
Weakly-Supervised Deep Image Hashing based on Cross-Modal Transformer
Ching-Ching Yang, W. Chu, S. Dubey
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216160
Weakly-supervised image hashing has emerged recently because web images associated with contextual text or tags are abundant. Text information weakly related to images can be utilized to guide the learning of a deep hashing network. In this paper, we propose Weakly-supervised deep Hashing based on Cross-Modal Transformer (WHCMT). First, cross-scale attention between image patches is discovered to form more effective visual representations; a baseline transformer is also adopted to find self-attention of tags and form tag representations. Second, the cross-modal attention between images and tags is discovered by the proposed cross-modal transformer, and effective hash codes are then generated by embedding layers. WHCMT is tested on semantic image retrieval, and we show that new state-of-the-art results can be obtained on the MIRFLICKR-25K and NUS-WIDE datasets.
Video Anomaly Detection Using Encoder-Decoder Networks with Video Vision Transformer and Channel Attention Blocks
Shimpei Kobayashi, A. Hizukuri, R. Nakayama
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215921
Surveillance cameras have been introduced at various locations for public safety. However, continuously watching surveillance videos that contain few abnormal events is tedious for security personnel. The purpose of this study is to develop a computerized anomaly detection method for surveillance videos. Our database consisted of three public anomaly detection datasets: UCSD Pedestrian 1, UCSD Pedestrian 2, and CUHK Avenue. In the proposed network, channel attention blocks are introduced into TransAnomaly, an existing anomaly detection network, to emphasize important channel information. The areas under the receiver operating characteristic curves (AUCs) of the proposed network were 0.827 for UCSD Pedestrian 1, 0.964 for UCSD Pedestrian 2, and 0.854 for CUHK Avenue, each greater than those of the conventional TransAnomaly without channel attention blocks (0.767, 0.934, and 0.839).
Multi-Prior Based Multi-Scale Condition Network for Single-Image HDR Reconstruction
Haorong Jiang, Fengshan Zhao, Junda Liao, Qin Liu, T. Ikenaga
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216063
High Dynamic Range (HDR) imaging aims to reconstruct the natural appearance of real-world scenes by expanding the bit depth of captured images. However, due to the imaging pipeline of off-the-shelf cameras, information loss in over-exposed areas and noise in under-exposed areas pose significant challenges for single-image HDR imaging. As a result, the key to success lies in restoring over-exposed regions and denoising under-exposed regions. In this paper, a multi-prior based multi-scale condition network is proposed to address this issue. (1) Three types of prior knowledge modulate the intermediate features of the reconstruction network from different perspectives, resulting in improved modulation effects. (2) Multi-scale fusion extracts and integrates deep semantic information from the various priors. Experiments on the NTIRE HDR challenge dataset demonstrate that the proposed method achieves state-of-the-art quantitative results.
Safe height estimation of deformable objects for picking robots by detecting multiple potential contact points
Jaesung Yang, Daisuke Hagihara, Kiyoto Ito, Nobuhiro Chihara
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215690
Object sorting in logistics warehouses is still carried out manually, and there is a great need for automation with arm robots. Target objects should be placed down gently whenever careful handling of products matters. We propose a method for estimating the height of a picked object with a single depth camera to achieve precise placement of items, such as stacking, especially for deformable objects, e.g., bags. The proposed method detects multiple potential contact points of a picked object and estimates the appropriate height at which to place the object using the point-cloud difference before and after picking. The validity of the proposed method was verified on 26 cases in which deformable objects were placed inside a container, confirming that object-height estimation is possible with an average error of 3.2 mm.
TinyPedSeg: A Tiny Pedestrian Segmentation Benchmark for Top-Down Drone Images
Y. Sahin, Elvin Abdinli, M. A. Aydin, Gozde Unal
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215829
The usage of Unmanned Aerial Vehicles (UAVs) has significantly increased in various fields such as surveillance, agriculture, transportation, and military operations. However, integrating UAVs into these applications requires the ability to navigate autonomously and to detect/segment objects in real time, which can be achieved through the use of neural networks. Although object detection for RGB images/videos obtained from UAVs has been widely studied, limited effort has been devoted to segmentation from top-down aerial images. When the UAV flies extremely high above the ground, the task can be framed as tiny-object segmentation. Thus, inspired by the TinyPerson dataset, which focuses on person detection from UAVs, we present TinyPedSeg, which contains 2563 pedestrians in 320 images. Specialized solely in pedestrian segmentation, our dataset is more informative than other UAV segmentation datasets. The dataset and the baseline code are available at https://github.com/ituvisionlab/tinypedseg
Uncertainty Criteria in Active Transfer Learning for Efficient Video-Specific Human Pose Estimation
Hiromu Taketsugu, N. Ukita
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215565
This paper presents a combination of Active Learning (AL) and Transfer Learning (TL) for efficiently adapting Human Pose (HP) estimators to individual videos. The proposed approach quantifies estimation uncertainty through the temporal changes and unnaturalness of estimated HPs. These uncertainty criteria are combined with a clustering-based representativeness criterion to avoid the redundant selection of similar samples. Experiments demonstrated that the proposed method achieves high learning efficiency and outperforms comparative methods.
Self-Supervised Pre-Training Boosts Semantic Scene Segmentation on LiDAR data
Mariona Carós, Ariadna Just, S. Seguí, Jordi Vitrià
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216191
Airborne LiDAR systems can capture the Earth's surface by generating extensive point cloud data composed of points mainly defined by 3D coordinates. However, labeling such points for supervised learning tasks is time-consuming. As a result, there is a need to investigate techniques that can learn from unlabeled data to significantly reduce the number of annotated samples. In this work, we propose to train a self-supervised encoder with Barlow Twins and use it as a pre-trained network for the task of semantic scene segmentation. The experimental results demonstrate that our unsupervised pre-training boosts performance once fine-tuned on the supervised task, especially for under-represented categories.
Generalization of pixel-wise phase estimation by CNN and improvement of phase-unwrapping by MRF optimization for one-shot 3D scan
Hiroto Harada, M. Mikamo, Furukawa Ryo, R. Sagawa, Hiroshi Kawasaki
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10215780
Active stereo techniques using single-pattern projection, a.k.a. one-shot 3D scan, have drawn wide attention from industry, medicine, and other fields. One severe drawback of one-shot 3D scan is sparse reconstruction. In addition, since the spatial pattern becomes complicated for the purpose of efficient embedding, it is easily affected by noise, which results in unstable decoding. To solve these problems, we propose a pixel-wise interpolation technique for one-shot scan that is applicable to any type of static pattern, provided the pattern is regular and periodic. This is achieved by a U-Net pre-trained on CG data with an efficient data augmentation algorithm. To further overcome the decoding instability, we propose a robust correspondence-finding algorithm based on Markov random field (MRF) optimization. We also propose a shape refinement algorithm based on B-spline and Gaussian kernel interpolation using explicitly detected laser curves. Experiments on real data with strong noise and textures show the effectiveness of the proposed method.
LOTS: Litter On The Sand dataset for litter segmentation
Paola Barra, Alessia Auriemma Citarella, Giosuè Orefice, M. Castrillón-Santana, A. Ciaramella
Pub Date: 2023-07-23 | DOI: 10.23919/MVA57639.2023.10216220
The marine ecosystem is threatened by human waste released into the sea. Among the most challenging marine litter to identify and remove are the small particles settled on the sand, which may be ingested by local fauna or cause damage to the marine ecosystem. Those particles are not easy to identify because they are easily confused with natural maritime elements such as shells and stones, which cannot be classified as "litter". In this work we present the Litter On The Sand (LOTS) dataset, with images of clean, dirty, and wavy sand from 3 different beaches.