
Latest publications from 2020 Digital Image Computing: Techniques and Applications (DICTA)

Fruit Detection in the Wild: The Impact of Varying Conditions and Cultivar
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363407
Michael Halstead, S. Denman, C. Fookes, C. McCool
Agricultural robotics is a rapidly evolving research field due to advances in computer vision, machine learning, and robotics, and to increased agricultural demand. However, there is still a considerable gap between farming requirements and available technology due to the large differences between cropping environments. This creates a pressing need for models with greater generalisability. We explore the issue of generalisability by considering a fruit (sweet pepper) that is grown using different cultivars (sub-species) and in different environments (field vs glasshouse). To investigate these differences, we publicly release three novel datasets captured across different domains, cultivars, cameras, and geographic locations. We exploit these new datasets both singly and in combination (to promote generalisation) to evaluate sweet pepper (fruit) detection and classification in the wild. For evaluation, we employ Faster-RCNN for detection due to the ease with which it can be expanded to incorporate multi-task learning by utilising the Mask-RCNN framework (instance-based segmentation). This multi-task learning technique is shown to increase the cross-dataset detection F1-score from 0.323 to 0.700, demonstrating the potential to reduce the need for new annotations through improved generalisation of the model. We further extend the Faster-RCNN architecture to include both super- and sub-classes (fruit and ripeness, respectively) by incorporating a parallel classification layer. For sub-class classification, considering the percentage of correct detections, we achieve an accuracy score of 0.900 in a cross-domain evaluation. In our experiments, we find that intra-environmental inference is generally inferior; however, diversifying the training data by combining datasets improves performance. Overall, the introduction of these three novel and diverse datasets demonstrates the potential for multi-task learning to improve cross-dataset generalisability, while also highlighting the importance of diverse data for adequately training and evaluating real-world systems.
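The cross-dataset detection F1-scores quoted above (0.323 rising to 0.700) combine a detector's precision and recall. A minimal sketch of how such a score is computed from true-positive, false-positive, and false-negative counts (the counts below are illustrative, not from the paper):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a detector."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # fraction of detections that are real fruit
    recall = tp / (tp + fn)     # fraction of real fruit that were detected
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 80 correct detections, 20 spurious, 20 missed.
print(f1_score(80, 20, 20))  # → 0.8
```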
Citations: 16
MixCaps: Capsules With Iteration Free Routing
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363386
Ifty Mohammad Rezwan, Mirza Belal Ahmed, S. Sourav, Ezab Quader, Arafat Hossain, Nabeel Mohammed
In this paper, we propose MixCaps, a new variant of Capsule Networks. It is a new architecture that significantly decreases the compute capability required to run capsule networks. Owing to the nature of our modules, we propose a new routing algorithm that does not require multiple iterations; all routing models prior to this architecture use multiple iterations. Unlike previous methods, this decreases our model's memory requirements by a significant margin. It also gives us the flexibility to use both matrix and vector poses, and as a consequence the model learns better complex representations. Despite all this, we show that our model performs on par with all prior capsule architectures on complex datasets such as Cifar-10 and Cifar-100.
Citations: 2
Furrow Mapping of Sugarcane Billet Density Using Deep Learning and Object Detection
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363394
J. Scott, Andrew Busch
Australia's sugar industry is currently undergoing significant hardship due to global market contractions from COVID-19, increased crop forecasts from larger global producers, and falling oil prices. Current planting practices rely on inefficient mass-flow planting techniques, and to date no attempt has been made to map the seed using machine vision in order to understand the underlying problems. This paper investigates the feasibility of creating a labelled sugarcane billet dataset using a readily available camera positioned beneath a planter, and of analysing it using a YOLOv3 network. The network achieved a high mean average precision of 0.852 at an intersection-over-union threshold of 0.5 (mAP50) on test images, and was used to provide planting metrics by generating a furrow map.
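The mAP50 metric reported above counts a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of box IoU (coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes offset by half their width overlap by one third.
# Under mAP50, this pairing (IoU < 0.5) would NOT count as a correct detection.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # → 0.333...
```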
Citations: 3
Monocular Rotational Odometry with Incremental Rotation Averaging and Loop Closure
Pub Date : 2020-10-05 DOI: 10.1109/DICTA51227.2020.9363388
Chee-Kheng Chng, Álvaro Parra, Tat-Jun Chin, Y. Latif
Estimating absolute camera orientations is essential for attitude estimation tasks. An established approach is to first carry out visual odometry (VO) or visual SLAM (V-SLAM), and retrieve the camera orientations (3 DOF) from the camera poses (6 DOF) estimated by VO or V-SLAM. One drawback of this approach, besides the redundancy in estimating full 6 DOF camera poses, is the dependency on estimating a map (3D scene points) jointly with the 6 DOF poses due to the basic constraint on structure-and-motion. To simplify the task of absolute orientation estimation, we formulate the monocular rotational odometry problem and devise a fast algorithm to accurately estimate camera orientations with 2D-2D feature matches alone. Underpinning our system is a new incremental rotation averaging method for fast, constant-time iterative updating. Furthermore, our system maintains a view-graph that 1) allows solving loop closure to remove camera orientation drift, and 2) can be used to warm start a V-SLAM system. We conduct extensive quantitative experiments on real-world datasets to demonstrate the accuracy of our incremental camera orientation solver. Finally, we showcase the benefit of our algorithm to V-SLAM: 1) solving the known rotation problem to estimate the trajectory of the camera and the surrounding map, and 2) enabling V-SLAM systems to track pure rotational motions.
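Rotation averaging fuses several noisy estimates of the same orientation into one consensus rotation. The paper's method works incrementally on SO(3); as a heavily simplified illustration only (not the authors' algorithm), here is the chordal mean of planar SO(2) rotations, where averaging unit direction vectors and projecting back onto the circle avoids the wrap-around problems of naively averaging angles:

```python
import math

def circular_mean(angles):
    """Chordal mean of planar rotations: average the unit vectors
    (cos t, sin t), then project back onto the circle with atan2."""
    s = sum(math.sin(t) for t in angles)
    c = sum(math.cos(t) for t in angles)
    return math.atan2(s, c)

# Noisy estimates of the same heading average toward the true value.
print(circular_mean([0.10, -0.08, 0.02]))
```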
Citations: 11
One-Shot learning based classification for segregation of plastic waste
Pub Date : 2020-09-29 DOI: 10.1109/DICTA51227.2020.9363374
Shivaank Agarwal, R. Gudi, Paresh Saxena
The problem of segregating recyclable waste is fairly daunting for many countries. This article presents an approach for image-based classification of plastic waste using one-shot learning techniques. The proposed approach exploits discriminative features generated via Siamese and triplet-loss convolutional neural networks to help differentiate between five types of plastic waste based on their resin codes. The approach achieves an accuracy of 99.74% on the WaDaBa Database [1].
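The triplet loss mentioned above trains an embedding so that an anchor sits closer to a same-class (positive) example than to a different-class (negative) one by at least a margin. A minimal sketch of the standard formulation (the margin value and the 2-D embeddings below are hypothetical, not the paper's settings):

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: penalise the anchor-positive distance
    unless it beats the anchor-negative distance by at least `margin`."""
    d = math.dist  # Euclidean distance (Python 3.8+)
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)

# Hypothetical 2-D embeddings for illustration.
print(triplet_loss([0, 0], [1, 0], [3, 0]))    # → 0.0 (margin satisfied)
print(triplet_loss([0, 0], [1, 0], [1.2, 0]))  # → ~0.3 (margin violated)
```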
Citations: 4
PS8-Net: A Deep Convolutional Neural Network to Predict the Eight-State Protein Secondary Structure
Pub Date : 2020-09-22 DOI: 10.1109/DICTA51227.2020.9363393
Md. Aminur Rab Ratul, M. T. Elahi, M. Mozaffari, Won-Sook Lee
Protein secondary structure is crucial to creating an information bridge between the primary and tertiary structures. Precise prediction of eight-state protein secondary structure (PSS) is significantly utilised in the structural and functional analysis of proteins. Deep learning techniques have recently been applied in this area and have raised the eight-state (Q8) protein secondary structure prediction accuracy remarkably. Nevertheless, from a theoretical standpoint, there is still much room for improvement, specifically in eight-state PSS prediction. In this study, we present a new deep convolutional neural network called PS8-Net to enhance the accuracy of eight-class PSS prediction. The input to this architecture is a carefully constructed feature matrix derived from protein sequence features and profile features. We introduce a new PS8 module with skip connections to extract long-term inter-dependencies from higher layers, obtain local contexts in earlier layers, and capture global information during secondary structure prediction. This architecture enables the efficient processing of local and global interdependencies between amino acids to make an accurate prediction for each class. To the best of our knowledge, our experimental results demonstrate that PS8-Net outperforms all state-of-the-art methods on the benchmark CullPdb6133, CB513, CASP10, and CASP11 datasets.
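Q8 accuracy, the headline metric in this task, is simply the fraction of residues whose predicted eight-state label matches the ground truth. A minimal sketch (the label strings below are made up for illustration):

```python
def q8_accuracy(pred: str, true: str) -> float:
    """Fraction of residues whose predicted 8-state label
    (H, G, I, E, B, T, S, C) matches the ground truth."""
    assert len(pred) == len(true)
    return sum(p == t for p, t in zip(pred, true)) / len(true)

# Hypothetical 8-residue sequence: 6 of 8 states predicted correctly.
print(q8_accuracy("HHEECTSC", "HHEECCSS"))  # → 0.75
```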
Citations: 4
Exploring Intensity Invariance in Deep Neural Networks for Brain Image Registration
Pub Date : 2020-09-21 DOI: 10.1109/DICTA51227.2020.9363409
Hassan Mahmood, Asim Iqbal, S. Islam
Image registration is a widely used technique for analysing large-scale datasets captured through various imaging modalities and techniques in biomedical imaging, such as MRI and X-rays. These datasets are typically collected from various sites and under different imaging protocols using a variety of scanners. Such heterogeneity in the data collection process causes inhomogeneity, i.e., variation in intensity (brightness) and noise distribution. These variations are detrimental to the performance of image registration, segmentation, and detection algorithms. Classical image registration methods are computationally expensive but handle these artifacts relatively well. Deep learning-based techniques, by contrast, are computationally efficient for automated brain registration but are sensitive to intensity variations. In this study, we investigate the effect of variation in intensity distribution among input image pairs on deep learning-based image registration methods. We find that the performance of these models degrades when brain image pairs with different intensity distributions are presented, even when their structures are similar. To overcome this limitation, we incorporate a structural similarity-based loss function into a deep neural network and test its performance on a validation split separated before training as well as on a completely unseen new dataset. We report that deep learning models trained with the structural similarity-based loss seem to perform better on both datasets. This investigation highlights a possible performance-limiting factor in deep learning-based registration models and suggests a potential solution: incorporating the intensity distribution variation of the input image pairs. Our code and models are available at https://github.com/hassaanmahmood/DeepIntense.
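The structural similarity (SSIM) index underlying such a loss compares two patches via their means, variances, and covariance, rewarding matching structure rather than matching raw intensities. Below is a sketch of the standard global-statistics formulation (no sliding window, and stabilising constants chosen per the usual convention); the paper's exact loss, typically 1 − SSIM, may differ in detail:

```python
def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Structural similarity of two equal-size grayscale patches,
    computed from global (not windowed) statistics."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

patch = [0.2, 0.4, 0.6, 0.8]
print(ssim_global(patch, patch))                   # → 1.0 (identical patches)
print(ssim_global(patch, [1 - p for p in patch]))  # negative: anti-correlated
```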
Citations: 1
Multi-species Seagrass Detection and Classification from Underwater Images
Pub Date : 2020-09-18 DOI: 10.1109/DICTA51227.2020.9363371
Scarlett Raine, R. Marchant, Peyman Moghadam, F. Maire, B. Kettle, Brano Kusy
Underwater surveys conducted using divers or robots equipped with customized camera payloads can generate a large number of images. Manual review of these images to extract ecological data is prohibitive in terms of time and cost, providing a strong incentive to automate the process using machine learning solutions. In this paper, we introduce a multi-species detector and classifier for seagrasses based on a deep convolutional neural network (achieving an overall accuracy of 92.4%). We also introduce a simple method to semi-automatically label image patches, thereby minimising the manual labelling requirement. We publicly release the dataset collected in this study, as well as the code and pre-trained models to replicate our experiments, at: https://github.com/csiro-robotics/deepseagrass
Citations: 9
Feature-Extracting Functions for Neural Logic Rule Learning
Pub Date : 2020-08-14 DOI: 10.1109/DICTA51227.2020.9363415
Shashank Gupta, A. Robles-Kelly
In this paper, we present a method aimed at integrating domain knowledge, abstracted as logic rules, into the predictive behaviour of a neural network using feature-extracting functions. We combine declarative first-order logic rules, which represent human knowledge in a logically structured format akin to that introduced in [1], with feature-extracting functions that act as the decision rules presented in [2]. These functions are embodied as programming functions which can represent, in a straightforward manner, the applicable domain knowledge as a set of logical instructions, and which provide a cumulative set of probability distributions over the input data. These distributions can then be used during the training process in a mini-batch strategy. We also illustrate the utility of our method for sentiment analysis and compare our results to those obtained using a number of alternatives elsewhere in the literature.
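To make the idea of a feature-extracting function concrete, here is a toy sketch: a programming function that encodes one hypothetical sentiment rule ("a negation word suggests negative sentiment") as a probability distribution over two classes. The rule, word list, and probabilities below are invented for illustration and are not the functions used in the paper:

```python
def negation_rule(sentence: str):
    """Toy feature-extracting function: maps a sentence to a
    probability distribution over (negative, positive) according to
    a single hypothetical logic rule. Purely illustrative."""
    negations = {"not", "never", "no", "cannot"}
    hit = any(tok in negations for tok in sentence.lower().split())
    # Rule fires: nudge mass toward the negative class; else stay uniform.
    return (0.8, 0.2) if hit else (0.5, 0.5)

print(negation_rule("this film is not good"))  # → (0.8, 0.2)
print(negation_rule("a delightful film"))      # → (0.5, 0.5)
```

Distributions produced by a bank of such functions could then be accumulated and consumed mini-batch by mini-batch during training, as the abstract describes.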
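As a rough illustration of what a feature-extracting function of this kind might look like — a toy sketch, not the authors' implementation — consider the classic sentiment rule that the clause after "but" dominates the sentiment of a sentence. The function encodes that rule as plain code and returns a probability distribution over (negative, positive); `POSITIVE_WORDS` and `NEGATIVE_WORDS` are hypothetical lexicons introduced only for this example.

```python
# Hypothetical tiny sentiment lexicons; in practice these would come
# from a curated word list, not be hard-coded.
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "boring", "awful"}

def but_rule(tokens):
    """Feature-extracting function encoding one logic rule: the clause
    after 'but' carries the sentiment. Returns a distribution over
    (negative, positive) that a network could be regularised towards
    during mini-batch training."""
    if "but" not in tokens:
        return (0.5, 0.5)  # rule does not fire: uninformative prior
    after = tokens[tokens.index("but") + 1:]
    pos = sum(w in POSITIVE_WORDS for w in after)
    neg = sum(w in NEGATIVE_WORDS for w in after)
    total = pos + neg
    if total == 0:
        return (0.5, 0.5)  # rule fires but lexicons are silent
    return (neg / total, pos / total)

print(but_rule("the plot was slow but the acting was great".split()))
# (0.0, 1.0)
```

Several such functions can be evaluated per example and their distributions accumulated, giving the network a rule-derived target alongside the data labels.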
Citations: 0
Recent Data Augmentation Strategies for Deep Learning in Plant Phenotyping and Their Significance
Pub Date : 2020-08-04 DOI: 10.1109/DICTA51227.2020.9363383
D. Gomes, Lihong Zheng
Plant phenotyping concerns the study of plant traits resulting from their interaction with their environment. Computer vision (CV) techniques represent promising, noninvasive approaches for related tasks such as counting leaves, measuring leaf area, and tracking plant growth. Among potential CV techniques, deep learning has been prevalent in the last couple of years. This surge of interest is largely due to the release of a data set of rosette plants that defined objective metrics for benchmarking solutions. This paper discusses an interesting aspect of the recent best-performing works in this field: their main contribution comes from novel data augmentation techniques rather than model improvements. Moreover, experiments are set up to highlight the significance of data augmentation practices for limited data sets with narrow distributions. This paper reviews the ingenious techniques used to generate synthetic data for augmenting training and presents evidence of their potential importance.
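A minimal sketch of the kind of label-preserving augmentation such surveys cover — illustrative, not taken from the paper — uses geometric transforms that leave the leaf count unchanged, so the annotation carries over to every augmented view for free:

```python
import numpy as np

def augment(image, rng):
    """Apply random flips and a random 90-degree rotation.

    These transforms preserve the leaf count of a top-down rosette
    image, so the phenotyping label is unchanged and one annotated
    image yields many training views.
    """
    if rng.random() < 0.5:
        image = np.fliplr(image)
    if rng.random() < 0.5:
        image = np.flipud(image)
    k = int(rng.integers(0, 4))
    return np.rot90(image, k)

rng = np.random.default_rng(0)
leaf = np.arange(16, dtype=np.float32).reshape(4, 4)  # stand-in for an image
batch = [augment(leaf, rng) for _ in range(8)]
# Every view keeps the same shape and the same multiset of pixel
# values, so the associated leaf-count label remains valid.
```

More elaborate strategies (e.g. compositing synthetic plants) follow the same principle: generate new views whose labels are known by construction.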
Citations: 3
Journal
2020 Digital Image Computing: Techniques and Applications (DICTA)