Improving the Efficient Neural Architecture Search via Rewarding Modifications
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290732
I. Gallo, Gabriele Magistrali, Nicola Landro, Riccardo La Grassa
Nowadays, a key challenge for the deep learning community is to design architectural models that obtain the best performance on specific datasets. Building effective models is not a trivial task, and it can be very time-consuming if done manually. Neural Architecture Search (NAS) has achieved remarkable results in deep learning applications over the past few years. It involves training a recurrent neural network (RNN) controller with Reinforcement Learning (RL) to automatically generate architectures. Efficient Neural Architecture Search (ENAS) was created to address the prohibitively expensive computational cost of NAS by means of weight sharing. In this paper we propose Improved-ENAS (I-ENAS), a further improvement of ENAS that augments the reinforcement learning training method by modifying the reward of each tested architecture according to the results obtained on previously tested architectures. We have conducted many experiments on different public-domain datasets and demonstrated that I-ENAS, in the worst case, matches the performance of ENAS, while in many other cases it surpasses ENAS in the convergence time needed to achieve better accuracies.
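The core idea, modifying each architecture's reward using the outcomes of previously tested architectures, can be illustrated with a minimal sketch. The shaping rule below (a bonus for beating the running best over a sliding history) is a hypothetical choice for illustration only; the actual modification rule is the one defined in the paper.

```python
# Illustrative reward shaping for an ENAS-style controller update.
# The specific rule (accuracy bonus relative to the best previously
# tested architecture) is an assumption made for this sketch.
from collections import deque

class RewardShaper:
    def __init__(self, history_size=50):
        # validation accuracies of previously tested architectures
        self.history = deque(maxlen=history_size)

    def shape(self, accuracy):
        """Return the modified reward used to update the RNN controller."""
        if not self.history:
            reward = accuracy
        else:
            best_so_far = max(self.history)
            # reward each architecture relative to what was already found
            reward = accuracy + (accuracy - best_so_far)
        self.history.append(accuracy)
        return reward

shaper = RewardShaper()
for acc in [0.71, 0.74, 0.73, 0.78]:  # accuracies of sampled architectures
    print(round(shaper.shape(acc), 3))
```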
{"title":"Improving the Efficient Neural Architecture Search via Rewarding Modifications","authors":"I. Gallo, Gabriele Magistrali, Nicola Landro, Riccardo La Grassa","doi":"10.1109/IVCNZ51579.2020.9290732","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290732","url":null,"abstract":"Nowadays, a challenge for the scientific community concerning deep learning is to design architectural models to obtain the best performance on specific data sets. Building effective models is not a trivial task and it can be very time-consuming if done manually. Neural Architecture Search (NAS) has achieved remarkable results in deep learning applications in the past few years. It involves training a recurrent neural network (RNN) controller using Reinforcement Learning (RL) to automatically generate architectures. Efficient Neural Architecture Search (ENAS) was created to address the prohibitively expensive computational complexity of NAS using weight sharing. In this paper we propose Improved-ENAS (I-ENAS), a further improvement of ENAS that augments the reinforcement learning training method by modifying the reward of each tested architecture according to the results obtained in previously tested architectures. We have conducted many experiments on different public domain datasets and demonstrated that I-ENAS, in the worst-case reaches the performance of ENAS, but in many other cases it overcomes ENAS in terms of convergence time needed to achieve better accuracies.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115363647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Experimental Validation of Bias in Checkerboard Corner Detection
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290652
M. J. Edwards, M. Hayes, R. Green
The sub-pixel corner refinement algorithm in OpenCV is widely used to refine checkerboard corner location estimates to sub-pixel precision. This paper shows, using both simulations and a large dataset of real images, that the algorithm produces estimates with significant bias and noise that depend on the sub-pixel corner location. In the real images, the noise ranged from around 0.013 px at the pixel centre to 0.0072 px at the edges, a difference of around 1.8×. The bias could not be determined from the real images due to residual lens distortion; in the simulated images it had a maximum magnitude of 0.043 px.
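For reference, the refinement step under study is OpenCV's cornerSubPix, typically applied after findChessboardCorners; the window size and termination criteria below are common defaults, not values taken from the paper.

```python
# Typical sub-pixel refinement of checkerboard corners with OpenCV.
import cv2

img = cv2.imread("checkerboard.png", cv2.IMREAD_GRAYSCALE)  # example input
pattern_size = (9, 6)  # inner corners per row and column (assumed layout)

found, corners = cv2.findChessboardCorners(img, pattern_size)
if found:
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    # refine the integer-pixel estimates to sub-pixel precision
    refined = cv2.cornerSubPix(img, corners, (11, 11), (-1, -1), criteria)
    print(refined.reshape(-1, 2)[:5])
```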
{"title":"Experimental Validation of Bias in Checkerboard Corner Detection","authors":"M. J. Edwards, M. Hayes, R. Green","doi":"10.1109/IVCNZ51579.2020.9290652","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290652","url":null,"abstract":"The sub-pixel corner refinement algorithm in OpenCV is widely used to refine checkerboard corner location estimates to sub-pixel precision. This paper shows using both simulations and a large dataset of real images that the algorithm produces estimates with significant bias and noise which depend on the sub-pixel corner location. In the real images, the noise ranged from around 0.013 px at the pixel centre to 0.0072 px at the edges, a difference of around $1.8times$. The bias could not be determined from the real images due to residual lens distortion; in the simulated images it had a maximum magnitude of 0.043 px.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124994307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting physician gaze in clinical settings using optical flow and positioning
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290716
A. Govindaswamy, E. Montague, D. Raicu, J. Furst
Electronic health record systems, used in clinical settings to facilitate informed decision making, affect the dynamics between the physician and the patient during clinical interactions. The interaction between the patient and the physician can impact patient satisfaction and overall health outcomes. Gaze during patient-doctor interactions has been found to affect the patient-physician relationship and is an important measure of attention towards humans and technology. This study aims to automatically label physician gaze in video interactions, which is typically measured using extensive human coding. Physicians' gaze is predicted at any time during the recorded video interaction using optical flow and body-positioning coordinates as image features. Findings show that physician gaze could be predicted with an accuracy of over 83%. Our approach highlights the potential for the model to serve as an annotation tool, reducing the extensive human labor of annotating videos for physician gaze. These interactions can further be connected to patient ratings to better understand patient outcomes.
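A minimal sketch of the kind of per-frame feature extraction described (dense optical flow plus body-position coordinates) feeding a standard classifier is shown below; the specific flow statistics and the random-forest classifier are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch: optical-flow statistics + pose coordinates -> gaze-label classifier.
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def flow_features(prev_gray, curr_gray):
    # dense Farneback optical flow between consecutive grayscale frames
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return np.array([mag.mean(), mag.std(), ang.mean()])

def frame_features(prev_gray, curr_gray, body_xy):
    # body_xy: flattened (x, y) keypoints from any pose estimator
    return np.concatenate([flow_features(prev_gray, curr_gray), body_xy])

clf = RandomForestClassifier(n_estimators=200)
# X: one feature vector per annotated frame, y: gaze label (e.g. patient/chart)
# clf.fit(X_train, y_train); y_pred = clf.predict(X_test)
```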
{"title":"Predicting physician gaze in clinical settings using optical flow and positioning","authors":"A. Govindaswamy, E. Montague, D. Raicu, J. Furst","doi":"10.1109/IVCNZ51579.2020.9290716","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290716","url":null,"abstract":"Electronic health record systems used in clinical settings to facilitate informed decision making, affects the dynamics between the physician and the patient during clinical interactions. The interaction between the patient and the physician can impact patient satisfaction, and overall health outcomes. Gaze during patient-doctor interactions was found to impact patient-physician relationship and is an important measure of attention towards humans and technology. This study aims to automatically label physician gaze for video interactions which is typically measured using extensive human coding. In this study, physicians’ gaze is predicted at any time during the recorded video interaction using optical flow and body positioning coordinates as image features. Findings show that physician gaze could be predicted with an accuracy of over 83%. Our approach highlights the potential for the model to be an annotation tool which reduces the extensive human labor of annotating the videos for physician’s gaze. These interactions can further be connected to patient ratings to better understand patient outcomes.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124995853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wavelet Based Thresholding for Fourier Ptychography Microscopy
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290707
Nazabat Hussain, Mojde Hasanzade, D. Breiby, M. Akram
Computational microscopy algorithms can be used to improve resolution by synthesizing a larger numerical aperture. Fourier Ptychographic (FP) microscopy utilizes multiple exposures, each illuminated by a coherent source at a unique incidence angle. The recorded images are often corrupted by background noise, and preprocessing improves the quality of the FP-recovered image. The preprocessing involves data denoising, thresholding and intensity balancing. We propose a wavelet-based thresholding scheme for noise removal. Any image can be decomposed into its coarse approximation, horizontal details, vertical details, and diagonal details using suitable wavelets. The detail coefficients are used to determine a suitable threshold, which is then applied. In the proposed algorithm, two compactly supported wavelet families, Daubechies and Biorthogonal (db4, db30, bior2.2 and bior6.8), have been used in conjunction with ptychographic phase retrieval. The obtained results show that the wavelet-based thresholding significantly improves the quality of the reconstructed FP microscopy image.
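In the spirit of the scheme described, a minimal PyWavelets sketch is shown below: decompose, derive a threshold from the detail sub-bands, soft-threshold them, and reconstruct. The universal-threshold rule and the soft mode are common choices assumed here, not necessarily the authors' exact scheme.

```python
# Minimal wavelet-thresholding denoiser (assumed universal threshold, soft mode).
import numpy as np
import pywt

def wavelet_denoise(image, wavelet="db4", level=2):
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    approx, details = coeffs[0], coeffs[1:]
    # noise estimate from the finest diagonal detail band (robust MAD estimator)
    sigma = np.median(np.abs(details[-1][2])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(image.size))
    new_details = [tuple(pywt.threshold(d, thresh, mode="soft") for d in band)
                   for band in details]
    return pywt.waverec2([approx] + new_details, wavelet)

noisy = np.random.rand(256, 256)            # stand-in for a raw FP exposure
clean = wavelet_denoise(noisy, "bior2.2")   # any of db4, db30, bior2.2, bior6.8
```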
{"title":"Wavelet Based Thresholding for Fourier Ptychography Microscopy","authors":"Nazabat Hussain, Mojde Hasanzade, D. Breiby, M. Akram","doi":"10.1109/IVCNZ51579.2020.9290707","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290707","url":null,"abstract":"Computational microscopy algorithms can be used to improve resolution by synthesizing a bigger numerical aperture. Fourier Ptychographic (FP) microscopy utilizes multiple exposures, each illuminated with a unique incidence angle coherent source. The recorded images are often corrupted with background noises and preprocessing improves the quality of the FP recovered image. The preprocessing involves data denoising, thresholding and intensity balancing. We propose a wavelet-based thresholding scheme for noise removal. Any image can be decomposed into its coarse approximation, horizontal details, vertical details, and diagonal details using suitable wavelets. The details are extracted to find a suitable threshold, which is used to perform thresholding. In the proposed algorithm, two wavelet families, Daubechies and Biorthogonal with compact support of db4, db30, bior2.2 and bior6.8, have been used in conjunction with ptychographic phase retrieval. The obtained results show that the wavelet-based thresholding significantly improves the quality of the reconstructed FP microscopy image.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115203192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging Linguistically-aware Object Relations and NASNet for Image Captioning
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290719
Naeha Sharif, M. Jalwana, Bennamoun, Wei Liu, Syed Afaq Ali Shah
Image captioning is a challenging vision-to-language task, which has garnered a lot of attention over the past decade. The introduction of Encoder-Decoder based architectures expedited the research in this area and provided the backbone of the most recent systems. Moreover, leveraging relationships between objects for holistic scene understanding, which in turn improves captioning, has recently sparked interest among researchers. Our proposed model encodes the spatial and semantic proximity of object pairs into linguistically-aware relationship embeddings. In addition, it captures the global semantics of the image using NASNet. This way, true semantic relations that are not apparent in the visual content of an image can be learned, such that the decoder can attend to the most relevant object relations and visual features to generate more semantically-meaningful captions. Our experiments highlight the usefulness of linguistically-aware object relations as well as NASNet visual features for image captioning.
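As a point of reference for the global image features, the pretrained NASNetLarge available in Keras can be used as a frozen encoder, as sketched below; the captioning decoder and the relationship embeddings themselves are not reproduced here.

```python
# Extracting a global NASNet feature vector for an image (Keras, ImageNet weights).
import numpy as np
import tensorflow as tf

encoder = tf.keras.applications.NASNetLarge(weights="imagenet",
                                            include_top=False, pooling="avg")

def global_image_features(image_path):
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(331, 331))
    x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis]
    x = tf.keras.applications.nasnet.preprocess_input(x)
    return encoder(x, training=False)  # (1, 4032) global semantic descriptor
```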
{"title":"Leveraging Linguistically-aware Object Relations and NASNet for Image Captioning","authors":"Naeha Sharif, M. Jalwana, Bennamoun, Wei Liu, Syed Afaq Ali Shah","doi":"10.1109/IVCNZ51579.2020.9290719","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290719","url":null,"abstract":"Image captioning is a challenging vision-to-language task, which has garnered a lot of attention over the past decade. The introduction of Encoder-Decoder based architectures expedited the research in this area and provided the backbone of the most recent systems. Moreover, leveraging relationships between objects for holistic scene understanding, which in turn improves captioning, has recently sparked interest among researchers. Our proposed model encodes the spatial and semantic proximity of object pairs into linguistically-aware relationship embeddings. Moreover, it captures the global semantics of the image using NASNet. This way, true semantic relations that are not apparent in visual content of an image can be learned, such that the decoder can attend to the most relevant object relations and visual features to generate more semantically-meaningful captions. Our experiments highlight the usefulness of linguistically-aware object relations as well as NASNet visual features for image captioning.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129187210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CoCoNet: A Collaborative Convolutional Network applied to fine-grained bird species classification
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290677
Tapabrata (Rohan) Chakraborty, B. McCane, S. Mills, U. Pal
We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative layer after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from the training samples as a whole, rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning. The ablation study shows that the proposed method consistently outperforms its constituent parts. CoCoNet also outperforms several state-of-the-art competing methods. Experiments have been performed on the fine-grained bird species classification problem as a representative example, but the method may be applied to other similar tasks. We also introduce a new public dataset for fine-grained species recognition, that of Indian endemic birds, and report initial results on it.
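To make the "weighted collaboration" idea concrete, the sketch below solves a ridge-regularised collaborative representation of a query feature over the training features; it is a standalone approximation of the concept, not the authors' end-to-end trainable layer.

```python
# Collaborative representation: express a query feature as an optimal weighted
# combination of all training features (ridge-regularised least squares).
import numpy as np

def collaborative_weights(train_feats, query_feat, lam=0.1):
    """train_feats: (n_train, d); query_feat: (d,); returns weights (n_train,)."""
    X = train_feats.T                                  # d x n_train
    G = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(G, X.T @ query_feat)

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(100, 512))   # e.g. CNN features of training images
query_feat = rng.normal(size=512)
w = collaborative_weights(train_feats, query_feat)
reconstruction = train_feats.T @ w          # weighted collaboration of the train set
```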
{"title":"CoCoNet: A Collaborative Convolutional Network applied to fine-grained bird species classification","authors":"Tapabrata (Rohan) Chakraborty, B. McCane, S. Mills, U. Pal","doi":"10.1109/IVCNZ51579.2020.9290677","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290677","url":null,"abstract":"We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative layer after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from training samples as a whole rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning. The ablation study shows that the proposed method outperforms its constituent parts consistently. CoCoNet also outperforms few state-of-the-art competing methods. Experiments have been performed on the fine-grained bird species classification problem as a representative example, but the method may be applied to other similar tasks. We also introduce a new public dataset for fine-grained species recognition, that of Indian endemic birds and have reported initial results on it.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123979353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evolutionary Algorithm Based Residual Block Search for Compression Artifact Removal
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290620
Rishil Shah
Lossy image compression is ubiquitously used for storage and transmission at reduced bit rates. Among the existing lossy image compression methods, the JPEG standard is the most widely used technique in the multimedia world. Over the years, numerous methods have been proposed to suppress the compression artifacts introduced in JPEG-compressed images. However, current learning-based methods rely on deep convolutional neural networks (CNNs) that are manually designed by researchers. The network design process requires extensive computational resources and expertise. Focusing on this issue, we investigate evolutionary search for finding the optimal residual-block-based architecture for artifact removal. We first define a residual network structure and its corresponding genotype representation used in the search. Then, we provide details of the evolutionary algorithm and the multi-objective function used to find the optimal residual block architecture. Finally, we present experimental results to indicate the effectiveness of our approach and compare performance with existing artifact removal networks. The proposed approach is scalable and portable to numerous low-level vision tasks.
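A toy sketch of such a search loop is given below: a residual-block genotype, random mutation, and a scalarised two-objective score. The encoding, operators and objectives are hypothetical stand-ins; the real ones are defined in the paper.

```python
# Toy evolutionary search over residual-block genotypes (illustrative only).
import random

def random_genotype(n_layers=4):
    return [{"kernel": random.choice([3, 5, 7]),
             "channels": random.choice([32, 64, 128]),
             "skip": random.random() < 0.5} for _ in range(n_layers)]

def mutate(genotype, rate=0.2):
    child = [dict(layer) for layer in genotype]
    for layer in child:
        if random.random() < rate:
            layer["kernel"] = random.choice([3, 5, 7])
    return child

def evaluate_quality(genotype):
    # placeholder: in practice, build the block, train briefly, measure PSNR
    return 30.0 + random.random()

def fitness(genotype):
    cost = sum(layer["channels"] for layer in genotype)   # proxy for model size
    return evaluate_quality(genotype) - 1e-3 * cost       # scalarised objectives

population = [random_genotype() for _ in range(20)]
for _ in range(5):  # a few generations of mutate-and-select
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [mutate(g) for g in population[:10]]
```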
{"title":"Evolutionary Algorithm Based Residual Block Search for Compression Artifact Removal","authors":"Rishil Shah","doi":"10.1109/IVCNZ51579.2020.9290620","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290620","url":null,"abstract":"Lossy image compression is ubiquitously used for storage and transmission at lower rates. Among the existing lossy image compression methods, the JPEG standard is the most widely used technique in the multimedia world. Over the years, numerous methods have been proposed to suppress the compression artifacts introduced in JPEG-compressed images. However, all current learning-based methods include deep convolutional neural networks (CNNs) that are manually-designed by researchers. The network design process requires extensive computational resources and expertise. Focusing on this issue, we investigate evolutionary search for finding the optimal residual block based architecture for artifact removal. We first define a residual network structure and its corresponding genotype representation used in the search. Then, we provide details of the evolutionary algorithm and the multi-objective function used to find the optimal residual block architecture. Finally, we present experimental results to indicate the effectiveness of our approach and compare performance with existing artifact removal networks. The proposed approach is scalable and portable to numerous low-level vision tasks.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123208083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning Methods for Virus Identification from Digital Images
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290670
Luxin Zhang, W. Yan
The use of deep learning methods for virus identification from digital images is a timely research topic. Given an electron microscopy image, virus recognition using deep learning approaches is critical at present, because virus identification by human experts is relatively slow and time-consuming. In this project, our objective is to develop deep learning methods for automatic virus identification from digital images; four viral species are taken into consideration, namely SARS, MERS, HIV, and COVID-19. We first examine virus morphological characteristics and propose a novel loss function aimed at virus identification from the given electron micrographs. We also incorporate an attention mechanism for virus localization and classification in digital images. In order to generate the most reliable estimates of bounding boxes and class labels for a virus as a visual object, we train and test five deep learning models: R-CNN, Fast R-CNN, Faster R-CNN, YOLO, and SSD, on our dataset of virus electron micrographs. Additionally, we explicate the evaluation approaches. The results reveal that SSD and Faster R-CNN outperform the other models in virus identification.
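For one of the five detectors compared, the standard torchvision recipe for adapting Faster R-CNN to this four-class problem is sketched below (torchvision 0.13+ API assumed); the authors' custom loss function and attention mechanism are not reproduced.

```python
# Adapting a pretrained Faster R-CNN head to 4 virus classes (+ background).
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 5  # background + SARS, MERS, HIV, COVID-19
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
# model is then fine-tuned on the electron-micrograph dataset with box annotations
```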
{"title":"Deep Learning Methods for Virus Identification from Digital Images","authors":"Luxin Zhang, W. Yan","doi":"10.1109/IVCNZ51579.2020.9290670","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290670","url":null,"abstract":"The use of deep learning methods for virus identification from digital images is a timely research topic. Given an electron microscopy image, virus recognition utilizing deep learning approaches is critical at present, because virus identification by human experts is relatively slow and time-consuming. In this project, our objective is to develop deep learning methods for automatic virus identification from digital images, there are four viral species taken into consideration, namely, SARS, MERS, HIV, and COVID-19. In this work, we firstly examine virus morphological characteristics and propose a novel loss function which aims at virus identification from the given electron micrographs. We take into account of attention mechanism for virus locating and classification from digital images. In order to generate the most reliable estimate of bounding boxes and classification for a virus as visual object, we train and test five deep learning models: R-CNN, Fast R-CNN, Faster R-CNN, YOLO, and SSD, based on our dataset of virus electron microscopy. Additionally, we explicate the evaluation approaches. The conclusion reveals SSD and Faster R-CNN outperform in the virus identification.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126346106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pothole Detection and Dimension Estimation System using Deep Learning (YOLO) and Image Processing
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290547
P. Chitale, Kaustubh Y. Kekre, Hrishikesh Shenai, R. Karani, Jay Gala
The world is advancing towards an autonomous environment at a great pace, and automation has become the need of the hour, especially during the current pandemic situation. The pandemic has hindered the functioning of many sectors, one of them being road development and maintenance. Creating a safe working environment for workers is a major concern of road maintenance during such difficult times. This can be achieved to some extent with the help of an autonomous system that aims at reducing human dependency. In this paper, one such system, for pothole detection and dimension estimation, is proposed. The proposed system uses a deep learning based algorithm, YOLO (You Only Look Once), for pothole detection. Further, an image processing based triangular similarity measure is used for pothole dimension estimation. The proposed system provides reasonably accurate results for both pothole detection and dimension estimation, and also helps in reducing the time required for road maintenance. The system uses a custom-made dataset consisting of images of water-logged and dry potholes of various shapes and sizes.
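The triangular similarity relation underlying the dimension estimation can be written out directly: a reference object of known width calibrates the focal length in pixels, after which the distance and real-world pothole size follow from apparent pixel sizes. The numbers below are made-up illustrative values.

```python
# Triangle-similarity distance and dimension estimation (illustrative values).
def focal_length_px(known_width_m, known_distance_m, width_in_pixels):
    return width_in_pixels * known_distance_m / known_width_m

def distance_m(known_width_m, focal_px, width_in_pixels):
    return known_width_m * focal_px / width_in_pixels

def real_size_m(size_in_pixels, dist_m, focal_px):
    return size_in_pixels * dist_m / focal_px

f = focal_length_px(0.30, 2.0, 240)    # calibrate with a 30 cm marker seen at 2 m
d = distance_m(0.30, f, 200)           # the same marker now appears 200 px wide
print(real_size_m(350, d, f))          # estimated pothole width in metres (~0.53)
```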
{"title":"Pothole Detection and Dimension Estimation System using Deep Learning (YOLO) and Image Processing","authors":"P. Chitale, Kaustubh Y. Kekre, Hrishikesh Shenai, R. Karani, Jay Gala","doi":"10.1109/IVCNZ51579.2020.9290547","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290547","url":null,"abstract":"The world is advancing towards an autonomous environment at a great pace and it has become a need of an hour, especially during the current pandemic situation. The pandemic has hindered the functioning of many sectors, one of them being Road development and maintenance. Creating a safe working environment for workers is a major concern of road maintenance during such difficult times. This can be achieved to some extent with the help of an autonomous system that will aim at reducing human dependency. In this paper, one of such systems, a pothole detection and dimension estimation, is proposed. The proposed system uses a Deep Learning based algorithm YOLO (You Only Look Once) for pothole detection. Further, an image processing based triangular similarity measure is used for pothole dimension estimation. The proposed system provides reasonably accurate results of both pothole detection and dimension estimation. The proposed system also helps in reducing the time required for road maintenance. The system uses a custom made dataset consisting of images of water-logged and dry potholes of various shapes and sizes.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129372719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
History and Evolution of Single Pass Connected Component Analysis
Pub Date: 2020-11-25 | DOI: 10.1109/IVCNZ51579.2020.9290585
D. Bailey
The techniques for single-pass connected component analysis have undergone significant changes from their initial development to current state-of-the-art algorithms. This review traces the evolution of the algorithms and explores the linkages and development of ideas introduced by various researchers. Three significant developments are: the recycling of labels to enable processing with resources proportional to the image width; reduction of the overheads associated with label merging; and processing of multiple pixels in parallel. These are of particular interest to those developing high-speed, low-latency image processing and machine vision systems.
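One of the three developments, the handling of label merging, can be illustrated with a small union-find sketch that resolves provisional labels when two runs of foreground meet; label recycling and parallel-pixel processing are not shown.

```python
# Provisional-label bookkeeping for single-pass connected component analysis.
class LabelMerger:
    def __init__(self):
        self.parent = {}

    def new_label(self):
        label = len(self.parent) + 1
        self.parent[label] = label
        return label

    def find(self, label):
        while self.parent[label] != label:
            self.parent[label] = self.parent[self.parent[label]]  # path halving
            label = self.parent[label]
        return label

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        self.parent[max(ra, rb)] = min(ra, rb)  # keep the smaller root label

m = LabelMerger()
a, b = m.new_label(), m.new_label()
m.merge(a, b)
print(m.find(b))  # -> 1: both provisional labels resolve to the same component
```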
{"title":"History and Evolution of Single Pass Connected Component Analysis","authors":"D. Bailey","doi":"10.1109/IVCNZ51579.2020.9290585","DOIUrl":"https://doi.org/10.1109/IVCNZ51579.2020.9290585","url":null,"abstract":"The techniques for single pass connected component analysis have undergone significant changes from their initial development to current state-of-the-art algorithms. This review traces the evolution of the algorithms, and explores the linkages and development of ideas introduced by various researchers. Three significant developments are: the recycling of labels to enable processing with resources proportional to the image width; reduction of overheads associated with label merging; and processing of multiple pixels in parallel. These are of particular interest to those developing high speed and low latency image processing and machine vision systems.","PeriodicalId":164317,"journal":{"name":"2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125755150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}