Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034578
The use of optical flow to aid feature matching has been employed in recent self-supervised video object segmentation (VOS) methods and has shown promising results. However, computing pixel-wise optical flow is costly, and the optical flow can also be further utilized for efficient regional segmentation. To address these challenges, we propose an efficient motion-aware mask propagation approach, dubbed EMMP, for self-supervised VOS. EMMP introduces an efficient patch optical flow to compute the motion offsets of image patches for dynamic matching ROI generation. Fine-grained pixel-wise feature matching is performed based on the dynamic matching ROIs for mask propagation. To reduce redundant segmentation while avoiding unnecessary computations, we re-use the patch optical flow to estimate reliable foreground ROIs in the next frame and perform regional segmentation. Evaluation on benchmark VOS datasets shows that EMMP achieves competitive performance with significant wall-clock speed-ups compared to existing self-supervised training methods; e.g., EMMP slightly outperforms MAMP and runs about 2× faster on segmentation. In addition, EMMP performs on par with many supervised training methods.
Title: Regional Video Object Segmentation by Efficient Motion-Aware Mask Propagation
Venue: 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA)
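The core idea of patch optical flow (one motion offset per image patch rather than per pixel) can be sketched as an exhaustive SSD patch match between consecutive frames. This is an illustrative sketch under assumed patch and search sizes, not the authors' EMMP implementation:

```python
import numpy as np

def patch_offsets(prev, curr, patch=8, search=2):
    """Estimate one (dy, dx) motion offset per patch by exhaustive
    SSD matching within a small search window (a sketch, not EMMP)."""
    H, W = prev.shape
    offs = np.zeros((H // patch, W // patch, 2), dtype=int)
    for i in range(H // patch):
        for j in range(W // patch):
            y, x = i * patch, j * patch
            ref = prev[y:y + patch, x:x + patch]
            best, best_off = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    # skip candidate patches that fall outside the frame
                    if 0 <= yy and yy + patch <= H and 0 <= xx and xx + patch <= W:
                        cand = curr[yy:yy + patch, xx:xx + patch]
                        ssd = float(((ref - cand) ** 2).sum())
                        if ssd < best:
                            best, best_off = ssd, (dy, dx)
            offs[i, j] = best_off
    return offs
```

The resulting per-patch offsets are far cheaper than dense flow and suffice to shift a matching ROI from one frame to the next.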
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034591
Counting is a common preventative measure taken to ensure surgical instruments are not retained during surgery, which could cause serious detrimental effects including chronic pain and sepsis. A hybrid human-AI system could support or partially automate this manual counting of instruments. An important element in evaluating the viability of deep learning computer vision-based counting is a suitable large-scale dataset of surgical instruments. Other domains, such as crowd analysis and instance counting, have leveraged synthetic datasets to evaluate and augment different approaches. We present a synthetic dataset (SORT), complemented by a smaller real-world dataset of surgical instruments (MSMI), to test the hypothesis that synthetic training data can improve the performance of multi-class multi-instance counting models when applied to real-world data. In this preliminary study, we provide comparative baselines for various popular counting techniques on synthetic data, such as direct regression, segmentation, localisation, and density estimation. These experiments are repeated at three resolutions, full high-definition (1920 × 1080 pixels), half (960 × 540 pixels), and quarter (480 × 270 pixels), to measure the robustness of different supervision methods to varying image scales. The results indicate that neither the degree of supervision nor the image resolution during model training significantly impacts performance on the synthetic data. However, when tested on the real-world instrument dataset, the models trained on synthetic data were significantly less accurate. These results indicate that further work, either refining the synthetic depictions or fine-tuning on real-world data, is needed before performance in domain adaptation scenarios matches that of training and testing solely on synthetic data.
Title: Can Synthetic Data Improve Multi-Class Counting of Surgical Instruments?
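Of the counting techniques the abstract lists, density estimation is the least obvious: each annotated instance is rendered as a unit-mass Gaussian blob, and the predicted count is simply the sum of the map. A minimal sketch of the ground-truth density-map construction (parameter `sigma` is an assumption):

```python
import numpy as np

def density_map(points, shape, sigma=2.0):
    """Build a ground-truth density map: one unit-mass Gaussian per
    annotated instance; the count is recovered as the map's sum."""
    H, W = shape
    ys, xs = np.mgrid[0:H, 0:W]
    dmap = np.zeros(shape, dtype=float)
    for (py, px) in points:
        g = np.exp(-((ys - py) ** 2 + (xs - px) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()  # normalise so each instance contributes mass 1
    return dmap
```

A regression network is then trained to predict this map, and counting reduces to integrating its output.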
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034581
This paper proposes a simple approach to unsupervised deep learning for foreground object segmentation. Robust principal component analysis (RPCA) can achieve background subtraction by minimizing nuclear and $\ell_1$ norms to exploit the prior knowledge about spatio-temporal sparseness and low-rankness of the foreground objects and background scene. With a combination of these norms as a loss function, the proposed method trains a U-Net-based model so as to encode and decode the sparse foreground objects for a batch of input images with a low-rank background. Once the model has learned enough features common to the foreground objects, it has the potential to detect them from any single image regardless of the low-rankness and sparseness. The proposed model performs online object segmentation with much less computational expense than that of RPCA. These advantages over RPCA are demonstrated with background subtraction in video surveillance. It is also shown experimentally that the present method can build a well-generalized cell nuclei segmentation model from only a few dozen unannotated training images.
Title: Unsupervised Deep Learning for Online Foreground Segmentation Exploiting Low-Rank and Sparse Priors
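The RPCA-style loss described above can be written down directly: with a batch flattened to a frames-by-pixels matrix $D$ and a predicted sparse foreground $S$, penalise the nuclear norm of the residual background $L = D - S$ plus an $\ell_1$ penalty on $S$. A minimal numpy sketch (the weight `lam` is an assumption, not a value from the paper):

```python
import numpy as np

def rpca_loss(batch, foreground, lam=0.1):
    """RPCA-inspired unsupervised loss: the predicted foreground S should
    be sparse (l1 norm) and the residual background L = D - S low-rank
    (nuclear norm).  `batch` has shape (num_frames, num_pixels)."""
    S = foreground
    L = batch - S
    nuclear = np.linalg.norm(L, ord='nuc')  # sum of singular values of L
    sparse = np.abs(S).sum()
    return nuclear + lam * sparse
```

In the paper's setting this scalar would be backpropagated through the U-Net that predicts `foreground`; here it is shown as a plain function for clarity.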
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034619
There are a myriad of factors that influence modern agriculture, including water scarcity [1], climate change [2], biodiversity loss [3], and pollutants [4]. Weed invasion is one of the greatest environmental threats to agricultural productivity. A threatening invasive species in Western Australia [5], and in many regions of the world [6], is wild radish (Raphanus raphanistrum). It has detrimental impacts on production costs, crop yield (reductions from 10% to 90% [7]), and crop quality due to its capacity to compete with crops for nutrients, light and water [8], [9]. In particular, the influence of wild radish on the quality and yield of canola has always been a concern in weed control [10]. Canola is known as one of the world's healthiest vegetable oils, can be grown in winter or spring, and is an environmentally friendly biofuel [11]. Canola production has a market value of AU$2.2 billion and increased significantly to over four million tonnes in Australia from 2012 to 2013 [12]. With more than two million tonnes of canola seed exported every year, Australia has become the world's second largest exporter of canola [13]. However, the challenge of spraying herbicides on only targeted weeds arises from the high visual similarity between canola and wild radish.
Title: An energy-efficient AkidaNet for morphologically similar weeds and crops recognition at the Edge
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034638
Kidney disease is one of many severe chronic diseases a person can have, and early detection can be pivotal for proper treatment. Neural networks have proven useful for disease prediction as modern science has progressed. In this paper, we propose a segmentation-based kidney tumor classification method using deep neural networks (DNNs). The work proceeds in two steps. First, we segment kidneys using a manual segmentation technique together with U-Net and SegNet. Then, for the classification task, modified MobileNetV2, VGG16 and InceptionV3 models were trained on the segmented kidney data. The CT KIDNEY DATASET: Normal-Cyst-Tumor and Stone dataset (published on Kaggle) was used to train our models. The classification models MobileNetV2, VGG16 and InceptionV3 scored 95.29%, 99.48% and 97.38% accuracy on the test set, respectively. We found that VGG16 has the best accuracy and the highest sensitivity and specificity. An explainable AI method (Grad-CAM) has been applied to explain our models' results.
Title: Kidney Tumor Segmentation and Classification Using Deep Neural Network on CT Images
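The hand-off between the two steps of such a segment-then-classify pipeline is typically a mask-and-crop operation: non-kidney voxels are zeroed and the slice is cropped to the mask's bounding box before it reaches the classifier. A minimal sketch of that step (the `pad` margin and function name are assumptions, not the paper's exact code):

```python
import numpy as np

def mask_and_crop(ct_slice, mask, pad=4):
    """Zero out non-kidney pixels and crop to the mask's bounding box,
    so the downstream classifier sees only the segmented region."""
    ys, xs = np.nonzero(mask)
    y0 = max(ys.min() - pad, 0)
    y1 = min(ys.max() + pad + 1, ct_slice.shape[0])
    x0 = max(xs.min() - pad, 0)
    x1 = min(xs.max() + pad + 1, ct_slice.shape[1])
    return (ct_slice * mask)[y0:y1, x0:x1]
```

The cropped patch would then be resized to the classifier's input resolution (e.g. 224 × 224 for the ImageNet-style backbones named above).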
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034641
Autonomous driving (AD) is undeniably gaining a substantial role in the automotive industry. AD exploits emerging advances in object detection that benefit from improvements in deep learning algorithms, especially convolutional neural networks (CNNs). Scene perception, the ability to extract relevant data from the surroundings, is a crucial task in self-driving vehicles. The efficacy of perception largely depends on the sensors and cameras used to capture the scene, as well as on the surrounding environmental conditions that affect them.
Title: Multi-Domain Thermal Object Detection Using Generative Adversarial Networks
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034637
3D LiDAR has transformed various urban infrastructure management practices, including urban vegetation detection and monitoring. The accessibility and convenience of use of LiDAR observations in ecological investigations have substantially improved because of advancements in LiDAR hardware systems and data processing techniques. In this paper, we introduce a slot attention-based network for semantic segmentation and biomass estimation of vegetation, which we name the 3D semantic vegetation transformer (3DSVT). Our method first extracts point features using RandLA-Net; these are then passed to slot attention to extract object-central features for semantic segmentation. Finally, vegetation biomass is computed from the resulting semantic segmentation. We compare our approach to state-of-the-art 3D point cloud semantic segmentation methods on the SensatUrban and Semantic3D datasets. The experiments show that our method gives promising results and can thus be used to analyse and compute vegetation biomass from 3D point clouds at large scale.
Title: 3D LiDAR Transformer for City-scale Vegetation Segmentation and Biomass Estimation
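The final biomass step (computing biomass from per-point semantic labels) can be illustrated as a simple proxy: count the points labelled as vegetation and scale by a per-point mass factor. Both the class id and the factor below are placeholder assumptions, not values from the paper:

```python
import numpy as np

VEGETATION_CLASS = 5  # hypothetical label id for vegetation points

def biomass_estimate(labels, kg_per_point=0.02):
    """Toy biomass proxy: count points predicted as vegetation and scale
    by an assumed per-point mass factor (both values are placeholders)."""
    n_veg = int((labels == VEGETATION_CLASS).sum())
    return n_veg * kg_per_point
```

In practice the scaling would come from an allometric model calibrated against field measurements, but the structure (segmentation first, aggregation second) is the same.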
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034642
Pedestrian attributes are defined as pedestrian appearance features that can be observed directly, usually including gender, age, clothing, etc. The purpose of pedestrian attribute recognition (PAR) is to perform semantic analysis on a given pedestrian image, and it is widely used in person re-identification [1] and human detection [2]. Owing to factors such as changeable postures, occlusion, uneven lighting and differing viewpoints, some features with poor semantics in pedestrian images are too weak to learn, making classification more difficult.
Title: A Multi-Granularity Feature Fusion Model for Pedestrian Attribute Recognition
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034622
Synthetic computed tomography (sCT) generation is a critical component of magnetic resonance imaging (MRI)-only radiation therapy workflows. The sCT computed from MRI is generally assessed by measuring Hounsfield unit (HU) discrepancies against a reference CT. The aim of this work was to propose a process for the blind assessment of local errors in generated sCTs when a reference CT is unavailable, allowing for safe MRI-only radiation therapy treatment planning. A personalised inter-patient registration method was applied to align a cohort of reference CTs into the same coordinate system. This process produced probability maps for each segmented organ, a mean CT image and a standard deviation map. These data were propagated to the anatomical space of each sCT, allowing out-of-distribution intensities to be detected at the voxel level by computing local z-scores. The organ probability maps were used to weight the resulting z-scores, reducing the bias induced by registration around structures. Two sCT generation methods were chosen as examples to illustrate this methodology: an atlas-based method (ABM) and a deep-learning approach based on a generative adversarial network (GAN) architecture. 39 patients treated with external beam radiotherapy for prostate cancer, with co-registered CT and MR pairs, were used for sCT generation; 26 of these patients were selected as reference CTs, and the sCTs of the remaining 13 patients were assessed. Accurate inter-individual registration was achieved, with mean Dice scores above 0.91 for all organs. The average volume of error represented 0.29% of the image for the ABM and 0.37% for the GAN. The proposed methodology produced 3D volumes that identify significant local sCT errors.
Depending on their size and location, these errors could lead to inaccurate tissue density computation during radiation therapy. This work provides an automated QA method aimed at preventing incorrect radiation dose delivery to patients.
Title: Local quality assessment of patient specific synthetic-CT via voxel-wise analysis
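The voxel-wise core of this QA process (z-score against the cohort statistics, weight by the organ probability map, threshold) can be sketched in a few lines. This is an illustrative sketch with an assumed threshold, not the paper's exact implementation:

```python
import numpy as np

def weighted_zscores(sct, mean_ct, std_ct, organ_prob, z_thresh=3.0):
    """Blind voxel-wise QA sketch: z-score each sCT voxel against the
    reference-cohort mean/std, weight by the organ probability map to
    damp registration noise near boundaries, and flag outliers."""
    z = (sct - mean_ct) / np.maximum(std_ct, 1e-6)  # avoid divide-by-zero
    zw = z * organ_prob
    return zw, np.abs(zw) > z_thresh
```

The flagged voxels form the 3D error volumes described above; their size and location determine whether the sCT is safe for dose computation.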
Pub Date: 2022-11-30 | DOI: 10.1109/DICTA56598.2022.10034594
The complex marine environment exacerbates the challenges of object detection manifold. In the modern era, marine trash presents a danger to the aquatic ecosystem, and it has always been challenging to address this issue with a complete grip. Therefore, there is a significant need to precisely detect marine deposits and locate them accurately in challenging aquatic surroundings. To mitigate the harm such waste causes to the marine environment, underwater object detection is a crucial tool. Our work explains the image enhancement strategies used and presents experiments exploring the best detection obtained after applying these methods. Specifically, we evaluate the performance of Detectron2's backbones using different base models and configurations for the underwater detection task. We first propose a channel stabilization technique on top of a simplified image enhancement model to help reduce haze and colour cast in training images. The proposed procedure shows improved results on the multi-scale objects present in the dataset. After processing the images, we explore various backbones in Detectron2 to find the best detection accuracy for these images. In addition, we use a sharpening filter with augmentation techniques, which highlights the profile of an object and helps recognize it easily. We demonstrate our results on the TrashCan dataset, in both its instance and material versions, and identify the best-performing backbone for this setting, to which we apply our channel stabilization and augmentation methods. We also compare our detection results from Detectron2 using the best backbones with those from Deformable Transformer. For small objects in the instance version of TrashCan 1.0, we obtain a box AP of 9.53, an absolute gain of 7 points over the baseline.
Title: Underwater Object Detection Enhancement via Channel Stabilization
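The abstract does not spell out the channel stabilization technique; a common colour-cast correction in the same spirit is gray-world channel balancing, which rescales each colour channel so its mean matches the global mean. The sketch below is an illustrative stand-in, not the paper's exact method:

```python
import numpy as np

def gray_world_balance(img):
    """Scale each colour channel so its mean matches the global mean,
    reducing the blue/green cast typical of underwater images
    (an illustrative stand-in for the paper's channel stabilization)."""
    img = img.astype(float)
    means = img.reshape(-1, 3).mean(axis=0)   # per-channel means (R, G, B)
    return img * (means.mean() / means)       # broadcast scale over channels
```

Applied before training, such a correction makes object colours more consistent across depths and turbidity levels, which is the stated goal of the enhancement stage.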