
Latest publications from 2020 Digital Image Computing: Techniques and Applications (DICTA)

Density-Based Vehicle Counting with Unsupervised Scale Selection
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363401
P. Dobes, Jakub Špaňhel, Vojtech Bartl, Roman Juránek, A. Herout
A significant hurdle in any counting task is the variance in the scale of the objects to be counted. While scale changes of some extent can be induced by perspective distortion, far more severe scale differences can easily occur, e.g. in images taken by a drone from different elevations above the ground. The aim of our work is to overcome this issue by leveraging only lightweight dot annotations and a minimum level of training supervision. We propose a modification to the Stacked Hourglass network which enables the model to process multiple input scales and to automatically select the most suitable candidate using a quality score. We alter the training procedure so that the quality scores are learned without direct supervision, and thus without requiring any additional annotation effort. We evaluate our method on three standard datasets: PUCPR+, TRANCOS and CARPK. The obtained results are on par with current state-of-the-art methods while being more robust towards significant variations in input scale.
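The scale-selection idea — run the counter at several input scales, score each output, and integrate the best density map — can be sketched in a few lines. This is a minimal illustration under invented toy density maps and quality scores; `count_with_scale_selection` is not the authors' Stacked Hourglass implementation:

```python
import numpy as np

def count_with_scale_selection(density_maps, quality_scores):
    """Select the scale whose predicted quality score is highest,
    then integrate that density map to obtain the object count."""
    best = int(np.argmax(quality_scores))
    return best, float(density_maps[best].sum())

# Toy example: three candidate scales, the middle one rated best.
maps = [np.full((4, 4), 0.5),      # coarse scale
        np.full((8, 8), 0.125),    # medium scale
        np.full((16, 16), 0.02)]   # fine scale
scores = np.array([0.1, 0.7, 0.2])  # per-scale quality scores (invented)
scale, count = count_with_scale_selection(maps, scores)
```

In the paper the quality scores are themselves network outputs learned without direct supervision; here they are fixed constants purely to show the selection step.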
{"title":"Density-Based Vehicle Counting with Unsupervised Scale Selection","authors":"P. Dobes, Jakub Špaňhel, Vojtech Bartl, Roman Juránek, A. Herout","doi":"10.1109/DICTA51227.2020.9363401","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363401","url":null,"abstract":"A significant hurdle within any counting task is the variance in a scale of the objects to be counted. While size changes of some extent can be induced by perspective distortion, more severe scale differences can easily occur, e.g. in case of images taken by a drone from different elevations above the ground. The aim of our work is to overcome this issue by leveraging only lightweight dot annotations and a minimum level of training supervision. We propose a modification to the Stacked Hourglass network which enables the model to process multiple input scales and to automatically select the most suitable candidate using a quality score. We alter the training procedure to enable learning of the quality scores while avoiding their direct supervision, and thus without requiring any additional annotation effort. We evaluate our method on three standard datasets: PUCPR+, TRANCOS and CARPK. The obtained results are on par with current state-of-the-art methods while being more robust towards significant variations in input scale.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122199414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Dual-Stage Domain Adaptive Mitosis Detection for Histopathology Images
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363411
Veena Dodballapur, Yang Song, Heng Huang, Mei Chen, Wojciech Chrzanowski, Weidong (Tom) Cai
Histopathology images for mitosis detection vary in appearance due to non-standard methods of preparing the tissues as well as differences in scanner hardware. This makes automatic machine-learning-based mitosis detection very challenging because of the domain shift between the training and testing datasets. In this paper, we propose a method of addressing this domain shift problem by using a two-stage domain adaptive neural network. In the first stage, we use a domain adaptive Mask R-CNN to generate masks for mitotic regions. The generated masks are then used by a second domain adaptive convolutional neural network to perform finer mitosis detection. Our method achieved state-of-the-art performance on both the ICPR 2012 and 2014 datasets. We demonstrate that using a domain-agnostic approach achieves better generalization and mitotic cell localization for the trained models.
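The cascade structure — a first stage proposes candidate regions, a second stage re-examines each one — can be sketched generically. A hedged illustration: `is_mitosis` stands in for the second-stage CNN, and the toy mask replaces the Mask R-CNN output; neither is the authors' code:

```python
import numpy as np
from scipy import ndimage

def refine_detections(image, candidate_mask, is_mitosis):
    """Two-stage cascade sketch: stage one supplies a binary candidate
    mask; stage two re-examines the centroid of every candidate
    region with a finer classifier."""
    labels, n = ndimage.label(candidate_mask)
    detections = []
    for region in range(1, n + 1):
        ys, xs = np.nonzero(labels == region)
        cy, cx = int(ys.mean()), int(xs.mean())
        if is_mitosis(image, cy, cx):
            detections.append((cy, cx))
    return detections
```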
{"title":"Dual-Stage Domain Adaptive Mitosis Detection for Histopathology Images","authors":"Veena Dodballapur, Yang Song, Heng Huang, Mei Chen, Wojciech Chrzanowski, Weidong (Tom) Cai","doi":"10.1109/DICTA51227.2020.9363411","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363411","url":null,"abstract":"Histopathology images for mitosis detection vary in appearance due to the non-standard method of preparing the tissues as well as differences in scanner hardware. This makes automatic machine learning based mitosis detection very challenging because of domain shift between the training and testing datasets. In this paper, we propose a method of addressing this domain shift problem by using a two-stage domain adaptive neural network. In the first stage, we use domain adaptive Mask R-CNN to generate masks for mitotic regions. Thus generated masks are used by a second domain adaptive convolutional neural network to perform finer mitosis detection. Our method achieved state-of-the-art performance on both ICPR 2012 and 2014 datasets. We demonstrate that using a domain agnostic approach achieves better generalization and mitosis cell localization for the trained models.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129903613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Using Environmental Context to Synthesis Missing Pixels
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363419
Thaer F. Ali, A. Woodley
Satellites have proven to be a technology that can help in a variety of environmental and human development contexts. However, at times some pixels in satellite images are not captured. These uncaptured pixels are called missing pixels, and they mean that important data for research and satellite imagery-based applications is lost. Therefore, pixel synthesis methods have been developed. This paper presents a new pixel synthesis method called the Iterative Self-Organizing Data Analysis Techniques Algorithm - Integration of Geostatistical and Temporal Missing Pixels' Properties (ISODATA-IGTMPP). The method builds upon the Integration of Geostatistical and Temporal Missing Pixels' Properties (IGTMPP) method and adds a seminal clustering technique, the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA). The clustering technique allows a new way of predicting a missing pixel from its environmental class, with the benefit of spatial and temporal properties. Here, the ISODATA-IGTMPP method was tested on the Spatial-Temporal Change in the Environment Context (STCEC) dataset and was compared with the results of four missing-pixel prediction methods. The method shows the best results and performs very well across different environment types.
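At its core, predicting a missing pixel from its environmental class reduces to: compute a feature for the missing location, assign it to the nearest cluster centre, and synthesise the centre's value. A toy sketch, assuming one-dimensional features and pre-fitted centres (ISODATA itself additionally splits and merges clusters during fitting):

```python
import numpy as np

def fill_missing_pixel(neighbour_values, cluster_centres):
    """Assign the mean of the valid (non-NaN) spatio-temporal
    neighbours to the nearest cluster centre and return that
    centre as the synthesised pixel value."""
    feature = np.nanmean(neighbour_values)
    nearest = int(np.argmin(np.abs(cluster_centres - feature)))
    return float(cluster_centres[nearest])
```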
{"title":"Using Environmental Context to Synthesis Missing Pixels","authors":"Thaer F. Ali, A. Woodley","doi":"10.1109/DICTA51227.2020.9363419","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363419","url":null,"abstract":"Satellites have proven to be a technology that can help in a variety of environmental and human development contexts. However, at times some pixels in the satellite images are not captured. These uncaptured pixels are called missing pixels. Having these missing pixels means that important data for research and satellite imagery-based applications is lost. Therefore, people have developed pixel synthesis methods. This paper presents a new pixel synthesis method called the Iterative Self-Organizing Data Analysis Techniques Algorithm - Integration of Geostatistical and Temporal Missing Pixels' Properties (ISODATA-IGTMPP). The method is built upon the Integration of Geostatistical and Temporal Missing Pixels' Properties (IG TMPP) method and adds a seminal clustering technique called the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA). The clustering technique allows a new way of predicting the missing pixel from their environmental class with benefit of the spatial and temporal properties. Here, the ISODATA-IGTMPP method was tested on the Spatial-Temporal Change in the Environment Context (STCEC) dataset and was compared with results of four missing pixel predicting methods. 
The method shows the best performing results and preforms very well across different environment types.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116324707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pseudo Supervised Solar Panel Mapping based on Deep Convolutional Networks with Label Correction Strategy in Aerial Images
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363379
Jue Zhang, X. Jia, Jiankun Hu
Solar panel mapping has gained rising interest in the renewable energy field with the aid of remote sensing imagery. Significant previous work is based on fully supervised learning with classical classifiers or convolutional neural networks (CNNs), which often require manual pixel-wise ground-truth annotations to provide accurate supervision. Weakly supervised methods can accept image-wise annotations, which helps reduce the cost of pixel-level labelling. An inevitable performance gap, however, exists between weakly and fully supervised methods in mapping accuracy. To address this problem, we propose a pseudo supervised deep convolutional network with a label correction strategy (PS-CNNLC) for solar panel mapping. It combines the benefits of both weak and strong supervision to provide accurate solar panel extraction. First, a convolutional neural network is trained on positive and negative samples with image-level labels. It is then used to automatically identify more positive samples from randomly selected unlabeled images. The feature maps of the positive samples are further processed by gradient-weighted class activation mapping to generate initial mapping results, which are taken as initial pseudo labels since they are generally coarse and incomplete. A progressive label correction strategy is designed to refine the initial pseudo labels and train an end-to-end target mapping network iteratively, thereby improving model reliability. Comprehensive evaluations and an ablation study validate the superiority of the proposed PS-CNNLC.
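The pseudo-label mechanics can be illustrated in two steps: binarise an activation map into coarse initial labels, then progressively overwrite them where the current model is confident. A hedged sketch with invented thresholds, not the paper's exact correction rule:

```python
import numpy as np

def initial_pseudo_labels(cam, threshold=0.5):
    """Binarise a (Grad-CAM style) class activation map into a
    coarse initial pseudo label."""
    cam = (cam - cam.min()) / (np.ptp(cam) + 1e-8)
    return (cam >= threshold).astype(np.uint8)

def correct_pseudo_labels(pseudo, prediction, confidence=0.9):
    """One step of progressive label correction: overwrite the
    pseudo label wherever the current model is highly confident."""
    corrected = pseudo.copy()
    corrected[prediction >= confidence] = 1
    corrected[prediction <= 1.0 - confidence] = 0
    return corrected
```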
{"title":"Pseudo Supervised Solar Panel Mapping based on Deep Convolutional Networks with Label Correction Strategy in Aerial Images","authors":"Jue Zhang, X. Jia, Jiankun Hu","doi":"10.1109/DICTA51227.2020.9363379","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363379","url":null,"abstract":"Solar panel mapping has gained a rising interest in renewable energy field with the aid of remote sensing imagery. Significant previous work is based on fully supervised learning with classical classifiers or convolutional neural networks (CNNs), which often require manual annotations of pixel-wise ground-truth to provide accurate supervision. Weakly supervised methods can accept image-wise annotations which can help reduce the cost for pixel-level labelling. Inevitable performance gap, however, exists between weakly and fully supervised methods in mapping accuracy. To address this problem, we propose a pseudo supervised deep convolutional network with label correction strategy (PS-CNNLC) for solar panels mapping. It combines the benefits of both weak and strong supervision to provide accurate solar panel extraction. First, a convolutional neural network is trained with positive and negative samples with image-level labels. It is then used to automatically identify more positive samples from randomly selected unlabeled images. The feature maps of the positive samples are further processed by gradient-weighted class activation mapping to generate initial mapping results, which are taken as initial pseudo labels as they are generally coarse and incomplete. A progressive label correction strategy is designed to refine the initial pseudo labels and train an end-to-end target mapping network iteratively, thereby improving the model reliability. 
Comprehensive evaluations and ablation study conducted validate the superiority of the proposed PS-CNNLC.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126541107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Vis-CRF: A Simplified Filtering Model for Vision
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363403
Nasim Nematzadeh, D. Powers, T. Lewis
Over the last decade, a variety of new neurophysiological experiments have deepened our understanding of retinal cell functionality, leading to new insights as to how, when and where retinal processing takes place, and the nature of the retinal representation and encoding sent to the cortex for further processing. Based on these neurobiological discoveries, we provide computer simulation evidence suggesting that geometrical illusions are explained, in part, by the interaction of multiscale visual processing performed in the retina, supporting previous studies [1, 2]. The output of our retinal-stage model, named Vis-CRF, a filtering vision model, is presented here for a sample of the Café Wall pattern and for an illusory pattern, in which the final percept arises from multiple-scale processing of Difference of Gaussians (DoG) and the perceptual interaction of foreground and background elements.
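The Difference of Gaussians at the heart of such centre-surround retinal models is straightforward to reproduce; a minimal sketch (the sigma values in the test are illustrative, not the paper's tuned parameters):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(image, sigma_centre, sigma_surround):
    """Centre-surround response: a narrow Gaussian blur minus a
    wide one, approximating a retinal receptive field at one scale."""
    centre = gaussian_filter(image, sigma_centre)
    surround = gaussian_filter(image, sigma_surround)
    return centre - surround
```

A multi-scale bank, as used for the Café Wall analysis, would simply evaluate this at several (sigma_centre, sigma_surround) pairs.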
{"title":"Vis-CRF: A Simplified Filtering Model for Vision","authors":"Nasim Nematzadeh, D. Powers, T. Lewis","doi":"10.1109/DICTA51227.2020.9363403","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363403","url":null,"abstract":"Over the last decade, a variety of new neurophysiological experiments have deepened our understanding about retinal cells functionality, leading to new insights as to how, when and where retinal processing takes place, and the nature of the retinal representation and encoding sent to the cortex for further processing. Based on these neurobiological discoveries, we provide computer simulation evidence to suggest that Geometrical illusions are explained in part, by the interaction of multiscale visual processing performed in the retina supporting previous studies [1, 2]. The output of our retinal stage model, named Vis-CRF which is a filtering vision model is presented here for a sample of Café Wall pattern and for an illusory pattern, in which the final percept arises from multiple scale processing of Difference of Gaussians (DoG) and the perceptual interaction of foreground and background elements.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127558557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
M2-Net: A Multi-scale Multi-level Feature Enhanced Network for Object Detection in Optical Remote Sensing Images
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363420
X. Ye, Fengchao Xiong, Jianfeng Lu, Haifeng Zhao, Jun Zhou
Object detection in remote sensing images is a challenging task due to the diversified orientation, complex background, dense distribution and scale variation of objects. In this paper, we tackle this problem by proposing a novel multi-scale multi-level feature enhanced network (M2-Net) that integrates a Feature Map Enhancement (FME) module and a Feature Fusion Block (FFB) into Rotational RetinaNet. The FME module aims to enhance weak features by factorizing the convolutional operation into two similar branches instead of one single branch, which helps broaden the receptive field with fewer parameters. This module is embedded into different layers of the backbone network to capture multi-scale semantic and location information for detection. The FFB module is used to shorten the information propagation path between low-level high-resolution features in shallow layers and high-level semantic features in deep layers, facilitating more effective feature fusion and object detection, especially for small objects. Experimental results on three benchmark datasets show that our method not only outperforms many one-stage detectors but also achieves competitive accuracy with lower time cost than two-stage detectors.
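One plausible reading of the two-branch factorisation is to replace a single k×k convolution with parallel k×1 and 1×k branches whose responses are summed, covering the same receptive field with 2k instead of k² weights. A hedged numpy sketch of that idea with fixed averaging kernels (the real module operates on learned CNN feature maps with learned weights):

```python
import numpy as np
from scipy.ndimage import convolve

def two_branch_conv(feature_map, k=3):
    """Sum of a vertical (k x 1) and a horizontal (1 x k) averaging
    branch: 2k weights instead of k*k for a k x k receptive field."""
    k_vert = np.ones((k, 1)) / k
    k_horz = np.ones((1, k)) / k
    return convolve(feature_map, k_vert) + convolve(feature_map, k_horz)
```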
{"title":"M2-Net: A Multi-scale Multi-level Feature Enhanced Network for Object Detection in Optical Remote Sensing Images","authors":"X. Ye, Fengchao Xiong, Jianfeng Lu, Haifeng Zhao, Jun Zhou","doi":"10.1109/DICTA51227.2020.9363420","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363420","url":null,"abstract":"Object detection in remote sensing images is a challenging task due to diversified orientation, complex background, dense distribution and scale variation of objects. In this paper, we tackle this problem by proposing a novel multi-scale multi-level feature enhanced network ($M$2-Net) that integrates a Feature Map Enhancement (FME) module and a Feature Fusion Block (FFB) into Rotational RetinaNet. The FME module aims to enhance the weak features by factorizing the convolutional operation into two similar branches instead of one single branch, which helps to broaden receptive field with less parameters. This module is embedded into different layers in the backbone network to capture multi-scale semantics and location information for detection. The FFB module is used to shorten the information propagation path between low-level high-resolution features in shallow layers and high-level semantic features in deep layers, facilitating more effective feature fusion and object detection especially those with small sizes. 
Experimental results on three benchmark datasets show that our method not only outperforms many one-stage detectors but also achieves competitive accuracy with lower time cost than two-stage detectors.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114461377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Automatic Assessment of Open Street Maps Database Quality using Aerial Imagery
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363412
Boris Repasky, Timothy Payne, A. Dick
Open data initiatives such as OpenStreetMap (OSM) are a powerful crowd-sourced approach to data collection. However, due to their crowd-sourced nature, the quality of the database depends heavily on the enthusiasm and determination of the public. We propose a novel method based on variational autoencoder generative adversarial networks (VAE-GAN), together with an information-theoretic measure of database quality based on the expected discrimination information between the original image and labels generated from OSM data. Experiments on overhead aerial imagery and segmentation masks generated from OSM data show that our proposed discrimination information measure is a promising indicator of regional database quality in OSM.
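Expected discrimination information is essentially a Kullback-Leibler style divergence between two distributions; a toy sketch over normalised histograms, with the mapping from images and OSM labels to histograms (the VAE-GAN part) omitted:

```python
import numpy as np

def discrimination_information(p, q, eps=1e-12):
    """Expected discrimination information (KL-divergence form)
    between two histograms p and q, normalised internally."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Identical distributions score zero; the further the label-derived distribution drifts from the image-derived one, the larger the score.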
{"title":"Automatic Assessment of Open Street Maps Database Quality using Aerial Imagery","authors":"Boris Repasky, Timothy Payne, A. Dick","doi":"10.1109/DICTA51227.2020.9363412","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363412","url":null,"abstract":"Open data initiatives such as OpenStreetMap (OSM) are a powerful crowd sourced approach to data collection. However due to their crowd-sourced nature the quality of the database heavily depends on the enthusiasm and determination of the public. We propose a novel method based on variational autoencoder generative adversarial networks (VAE-GAN) together with an information theoretic measure of database quality based on the expected discrimination information between the original image and labels generated from OSM data. Experiments on overhead aerial imagery and segmentation masks generated from OSM data show that our proposed discrimination information measure is a promising measure to regional database quality in OSM.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116458686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HCI for Elderly, Measuring Visual Complexity of Webpages Based on Machine Learning
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363381
Zahra Sadeghi, E. Homayounvala, M. Borhani
The increasing number of elderly persons, aged 65 and over, highlights the problem of improving their experience with computers and the web, considering their preferences and needs. Cognitive, haptic, visual, and motor skills decline with age. Given these reduced abilities, the visual complexity of web pages has a major influence on the quality of the user experience of elderly users. It is therefore quite beneficial if the visual complexity of applications and websites designed for them can be measured and reduced; in this way a personalized, less complex version of a website can be provided for older users. In this article, a new approach for measuring visual complexity is proposed using both Human-Computer Interaction (HCI) and machine learning methods. Six features are considered for the complexity measurements. Experimental results demonstrate that the trained machine learning approach increases the accuracy of classifying applications and websites by their visual complexity to 82%, higher than its competitors. Besides, a feature selection algorithm indicates that features such as clutter and equilibrium have the most influence on the classification of webpages based on their visual complexity.
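Of the complexity features, clutter is the easiest to illustrate: one simple proxy is the fraction of pixels whose gradient magnitude is above the image average. This toy metric is our own illustration, not the paper's exact feature definition:

```python
import numpy as np

def clutter_score(gray):
    """Toy clutter feature for a greyscale screenshot: the fraction
    of pixels whose gradient magnitude exceeds the image average."""
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    return float((magnitude > magnitude.mean()).mean())
```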
{"title":"HCI for Elderly, Measuring Visual Complexity of Webpages Based on Machine Learning","authors":"Zahra Sadeghi, E. Homayounvala, M. Borhani","doi":"10.1109/DICTA51227.2020.9363381","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363381","url":null,"abstract":"The increasing number of elderly persons, aged 65 and over, highlights the problem of improving their experience with computers and the web considering their preferences and needs. Elderlies' skills like cognitive, haptic, visual, and motor skills are reduced by age. The visual complexity of web pages has a major influence on the quality of user experience of elderly users according to their reduced abilities. Therefore, it is quite beneficial if the visual complexity of web pages could be measured and reduced in applications and websites which are designed for them. In this way a personalized less complex version of the website could be provided for older users. In this article, a new approach for measuring the visual complexity is proposed by using both Human-Computer Interaction (HCI) and machine learning methods. Six features are considered for complexity measurements. Experimental results demonstrated that the trained proposed machine learning approach increases the accuracy of classification of applications and websites based on their visual complexity up to 82% which is more than its competitors. 
Besides, a feature selection algorithm indicates that features such as clutter and equilibrium were selected to have the most influence on the classification of webpages based on their visual complexity.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124005656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
UNET-Based Multi-Task Architecture for Brain Lesion Segmentation
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363397
Ava Assadi Abolvardi, Len Hamey, K. Ho-Shon
Image segmentation is the task of extracting the region of interest in images and is one of the main applications of computer vision in the medical domain. Like other computer vision tasks, deep learning is the main solution to image segmentation problems. Deep learning methods are data-hungry and need a huge amount of data for training. On the other hand, data shortage is always a problem, especially in the medical domain. Multi-task learning is a technique which helps a deep model learn a better representation of the data distribution by introducing related auxiliary tasks. In this study, we investigate whether it is better to provide such auxiliary information as an input to the network, or to use it as an auxiliary task and design a multi-output network. Our findings suggest that although the multi-output approach improves overall performance, the best result is achieved when the extra information serves as auxiliary input.
{"title":"UNET-Based Multi-Task Architecture for Brain Lesion Segmentation","authors":"Ava Assadi Abolvardi, Len Hamey, K. Ho-Shon","doi":"10.1109/DICTA51227.2020.9363397","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363397","url":null,"abstract":"Image segmentation is the task of extracting the region of interest in images and is one of the main applications of computer vision in the medical domain. Like other computer vision tasks, deep learning is the main solution to image segmentation problems. Deep learning methods are data-hungry and need a huge amount of data for training. On the other side, data shortage is always a problem, especially in the medical domain. Multi-task learning is a technique which helps the deep model to learn better representation from data distribution by introducing related auxiliary tasks. In this study, we investigate a research question to whether it is better to provide this auxiliary information as an input to the network, or is it better to use this task and design a multi-output network. Our findings suggest that however, the multi-output manner improves the overall performance, but the best result achieves when this extra information serves as auxiliary input information.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125870631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
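The multi-output setup described in the abstract above, a shared network with a main segmentation head plus a related auxiliary head, is commonly trained by minimizing a weighted sum of per-task losses. A minimal sketch in plain Python; the task names, loss values, and weighting coefficients below are illustrative assumptions, not taken from the paper:

```python
def multi_task_loss(task_losses, weights):
    """Weighted sum of per-task losses for a multi-output network.

    task_losses: dict mapping task name -> scalar loss for the current batch
    weights: dict mapping task name -> weighting coefficient
    """
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Hypothetical per-batch losses: main segmentation task plus one auxiliary task.
losses = {"segmentation": 0.42, "auxiliary": 0.10}
lambdas = {"segmentation": 1.0, "auxiliary": 0.5}

total = multi_task_loss(losses, lambdas)  # 0.42 + 0.5 * 0.10, i.e. about 0.47
```

The auxiliary weight is typically tuned so the side task regularizes the shared representation without dominating the main segmentation objective.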
Automated Pneumoconiosis Detection on Chest X-Rays Using Cascaded Learning with Real and Synthetic Radiographs 使用级联学习与真实和合成射线片的胸部x射线自动尘肺病检测
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363416
Dadong Wang, Y. Arzhaeva, Liton Devnath, Maoying Qiao, Saeed K. Amirgholipour, Qiyu Liao, R. McBean, J. Hillhouse, S. Luo, David Meredith, K. Newbigin, Deborah Yates
Pneumoconiosis is an incurable respiratory disease caused by long-term inhalation of respirable dust. Due to the low incidence of pneumoconiosis and restrictions on sharing patient data, the number of available pneumoconiosis X-rays is insufficient, which introduces significant challenges for training deep learning models. In this paper, we use both real and synthetic pneumoconiosis radiographs to train a cascaded machine learning framework for the automated detection of pneumoconiosis, including a machine-learning-based pixel classifier for lung field segmentation, Cycle-Consistent Adversarial Networks (CycleGAN) for generating abundant lung field images for training, and a Convolutional Neural Network (CNN) based image classifier. Experiments compare the classification results of several state-of-the-art machine learning models with ours. Our proposed model outperforms the others and achieves an overall classification accuracy of 90.24%, a specificity of 88.46% and an excellent sensitivity of 93.33% for detecting pneumoconiosis.
尘肺病是一种无法治愈的呼吸道疾病,由长期吸入可呼吸性粉尘引起。由于尘肺发病率低且患者数据共享受到限制,可用的尘肺x射线数量不足,这给深度学习模型的训练带来了重大挑战。在本文中,我们使用真实和合成的尘肺x线片来训练用于尘肺自动检测的级联机器学习框架,包括基于机器学习的肺场分割像素分类器,循环一致对抗网络(CycleGAN)用于生成丰富的肺场图像进行训练,以及基于卷积神经网络(CNN)的图像分类器。通过实验比较了几种最先进的机器学习模型和我们的模型的分类结果。我们提出的模型优于其他模型,总体分类准确率为90.24%,特异性为88.46%,检测尘肺的灵敏度为93.33%。
{"title":"Automated Pneumoconiosis Detection on Chest X-Rays Using Cascaded Learning with Real and Synthetic Radiographs","authors":"Dadong Wang, Y. Arzhaeva, Liton Devnath, Maoying Qiao, Saeed K. Amirgholipour, Qiyu Liao, R. McBean, J. Hillhouse, S. Luo, David Meredith, K. Newbigin, Deborah Yates","doi":"10.1109/DICTA51227.2020.9363416","DOIUrl":"https://doi.org/10.1109/DICTA51227.2020.9363416","url":null,"abstract":"Pneumoconiosis is an incurable respiratory disease caused by long-term inhalation of respirable dust. Due to small pneumoconiosis incidence and restrictions on sharing of patient data, the number of available pneumoconiosis X-rays is insufficient, which introduces significant challenges for training deep learning models. In this paper, we use both real and synthetic pneumoconiosis radiographs to train a cascaded machine learning framework for the automated detection of pneumoconiosis, including a machine learning based pixel classifier for lung field segmentation, and Cycle-Consistent Adversarial Networks (CycleGAN) for generating abundant lung field images for training, and a Convolutional Neural Network (CNN) based image classier. Experiments are conducted to compare the classification results from several state-of-the-art machine learning models and ours. Our proposed model outperforms the others and achieves an overall classification accuracy of 90.24%, a specificity of 88.46% and an excellent sensitivity of 93.33% for detecting pneumoconiosis.","PeriodicalId":348164,"journal":{"name":"2020 Digital Image Computing: Techniques and Applications (DICTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123433839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
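The metrics reported in the abstract above (accuracy 90.24%, specificity 88.46%, sensitivity 93.33%) follow from the standard confusion-matrix definitions. A small sketch in plain Python; the counts below are illustrative values chosen so that the three reported percentages are reproduced, and the study's actual confusion matrix may differ:

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall on the positive (pneumoconiosis) class
    specificity = tn / (tn + fp)            # recall on the negative class
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 14/15 positives detected, 23/26 negatives correctly rejected.
sens, spec, acc = classification_metrics(tp=14, fp=3, tn=23, fn=1)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

High sensitivity is the priority in a screening setting like this one, since a missed pneumoconiosis case is costlier than a false alarm that triggers further review.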