
Latest publications from 2018 Digital Image Computing: Techniques and Applications (DICTA)

In Situ Cane Toad Recognition
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615780
D. Konovalov, Simindokht Jahangard, L. Schwarzkopf
Cane toads are invasive, toxic to native predators, compete with native insectivores, and have a devastating impact on Australian ecosystems, prompting the Australian government to list toads as a key threatening process under the Environment Protection and Biodiversity Conservation Act 1999. Mechanical cane toad traps could be made friendlier to native fauna if they could distinguish invasive cane toads from native species. Here we designed and trained a Convolutional Neural Network (CNN) starting from the Xception CNN. The XToadGmp toad-recognition CNN we developed was trained end-to-end using heat-map Gaussian targets. After training, XToadGmp required minimal image pre- and post-processing; when tested on 720×1280 images, it achieved 97.1% classification accuracy on 1863 toad and 2892 not-toad test images that were not used in training.
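The heat-map Gaussian targets mentioned above can be illustrated with a minimal numpy sketch: a 2-D Gaussian centred on a labelled toad location supervises a fully convolutional network, and global max pooling over the predicted map gives an image-level score. All names and sizes here are illustrative, not the paper's exact pipeline.

```python
import numpy as np

def gaussian_heatmap(height, width, cy, cx, sigma=8.0):
    """Gaussian target centred on a labelled object location (cy, cx).

    A map like this (near 1 at the toad, near 0 elsewhere) can supervise a
    fully convolutional classifier end-to-end; global max pooling over the
    predicted map then yields a single toad/not-toad score per image.
    """
    ys = np.arange(height)[:, None]
    xs = np.arange(width)[None, :]
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# e.g. a 720x1280 input downsampled 16x by the CNN backbone
heat = gaussian_heatmap(45, 80, cy=20, cx=40)
score = heat.max()  # global max pooling -> image-level score
```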
Citations: 5
Left Ventricle Volume Measuring using Echocardiography Sequences
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615766
Yi Guo, S. Green, L. Park, Lauren Rispen
Measuring left ventricle (LV) volume is a challenging problem in physiological study. Echocardiography is one non-intrusive method suited to this task. By extracting the left ventricle area from ultrasound images, the volume can be approximated from the size of that area. The core of the problem is then identifying the left ventricle in noisy images while accounting for spatio-temporal information. We propose adaptive sparse smoothing for left ventricle segmentation in each frame of an echocardiography video, which provides robustness against the strong speckle noise in ultrasound imagery. We then refine the identified left ventricle areas (represented as curves in a polar coordinate system) by a fixed-rank principal component analysis as post-processing. The method is tested on two data sets, with left ventricle areas labelled for some frames by an expert physiologist, and compared against an active-contour-based method. The experimental results show clearly that the proposed method is more accurate than the competing method.
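Since the paper represents LV boundaries as curves in a polar coordinate system, the enclosed area follows from the standard polar integral A = ½∮r²dθ. A minimal numpy discretisation of that step (an assumed detail, not the paper's exact post-processing):

```python
import numpy as np

def polar_area(radii):
    """Area enclosed by a closed curve given as radii r(theta_i),
    sampled uniformly over [0, 2*pi): A = 1/2 * sum(r_i^2) * dtheta."""
    radii = np.asarray(radii, float)
    dtheta = 2.0 * np.pi / len(radii)
    return 0.5 * np.sum(radii ** 2) * dtheta

# Sanity check: a circle of radius 2 should give ~pi * 2^2
area = polar_area(np.full(360, 2.0))
```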
Citations: 1
Cluster-Based Crowd Movement Behavior Detection
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615809
Meng Yang, Lida Rashidi, A. S. Rao, S. Rajasegarar, Mohadeseh Ganji, M. Palaniswami, C. Leckie
Crowd behaviour monitoring and prediction is an important research topic in video surveillance that has gained increasing attention. In this paper, we propose a novel architecture for crowd event detection, which comprises methods for object detection, clustering of various groups of objects, characterizing the movement patterns of those groups, detecting group events, and finding the change points of group events. In the proposed framework, we use clusters to represent the groups of objects/people present in the scene, and extract the movement patterns of the various groups over the video sequence. We define several crowd events and propose a methodology to detect the change points of group events over time. We evaluated our scheme using six video sequences from benchmark datasets, covering events such as walking, running, global merging, local merging, global splitting and local splitting. We compared our scheme with state-of-the-art methods and showed the superiority of our method in accurately detecting crowd behavioural changes.
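The merge/split events described above can be sketched by comparing cluster counts across frames: a crude single-link grouping stands in for the paper's clustering stage, and a drop in group count between frames suggests a merge, a rise a split. The threshold, data, and event rule are illustrative assumptions.

```python
import numpy as np

def group_count(points, eps=2.0):
    """Number of groups when points closer than eps are linked
    (single-link, via union-find)."""
    points = np.asarray(points, float)
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(points[i] - points[j]) < eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

frame_a = [(0, 0), (1, 0), (10, 10), (11, 10)]    # two separate groups
frame_b = [(5, 5), (5.5, 5), (6, 5.5), (6.5, 5)]  # one merged group
event = "merge" if group_count(frame_b) < group_count(frame_a) else "none"
```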
Citations: 6
Similar Gesture Recognition using Hierarchical Classification Approach in RGB Videos
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615804
Di Wu, N. Sharma, M. Blumenstein
Recognizing human actions from video streams has become one of the most popular research areas in computer vision and deep learning in recent years. Action recognition is widely used in different real-life scenarios, such as surveillance, robotics, healthcare, video indexing and human-computer interaction. The challenges and complexity involved in developing a video-based human action recognition system are manifold. In particular, recognizing actions with similar gestures and describing complex actions are very challenging problems. To address these issues, we study the problem of classifying human actions using Convolutional Neural Networks (CNNs) and develop a hierarchical 3DCNN architecture for similar-gesture recognition. The proposed model first merges each pair of similar gestures into one class and classifies them along with all other classes, as a stage-1 classification. In stage 2, the similar gesture pairs are classified individually, which reduces the problem to binary classification. We apply and evaluate the developed models to recognize similar human actions on the HMDB51 dataset. The results show that the proposed model achieves high performance in comparison with state-of-the-art methods.
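The two-stage routing described above can be sketched as follows. The classifiers here are stand-in functions (not the paper's 3DCNNs), and the class names are invented for illustration: stage 1 predicts over merged labels, and if the prediction is a merged similar-gesture pair, a binary stage-2 classifier disambiguates it.

```python
# Hypothetical two-stage routing for similar-gesture pairs.
SIMILAR_PAIRS = {"wave_or_swipe": ("wave", "swipe")}

def stage1(video):
    """Coarse classifier over merged classes (stand-in for the stage-1 3DCNN)."""
    return "wave_or_swipe"

def stage2(video, pair):
    """Binary classifier for one similar pair (stand-in for a stage-2 3DCNN)."""
    return pair[0]

def classify(video):
    label = stage1(video)
    if label in SIMILAR_PAIRS:  # route ambiguous merged labels to stage 2
        label = stage2(video, SIMILAR_PAIRS[label])
    return label

result = classify("dummy_clip")
```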
Citations: 0
Heuristic Evaluations of Cultural Heritage Websites
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615847
Duyen Lam, Atul Sajjanhar
Heuristic evaluation, a systematic inspection, aims to find usability problems in websites. Numerous sets of usability heuristics have been adapted to specific fields through the examination and judgment of evaluators. Cultural heritage has drawn significant interest and needs thorough investigation in order to improve website interfaces and help promote the cultural values of a country. An in-depth review of the literature on user-interface evaluation for cultural heritage is presented. We examine several aspects, including cultural dimensions in interface design, culture-based adaptive web design, and technologies for the interfaces of cultural heritage websites. The findings are expected to form a foundation for designing archiving websites in the cultural heritage domain.
Citations: 3
Human Brain Tissue Segmentation in fMRI using Deep Long-Term Recurrent Convolutional Network
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615850
Sui Paul Ang, S. L. Phung, M. Schira, A. Bouzerdoum, S. T. Duong
Accurate segmentation of different brain tissue types is an important step in the study of neuronal activity using functional magnetic resonance imaging (fMRI). Traditionally, due to the low spatial resolution of fMRI data and the absence of an automated segmentation approach, human experts often resort to superimposing fMRI data on high-resolution structural MRI images for analysis. The recent advent of fMRI with higher spatial resolution offers a new possibility of differentiating brain tissues by their spatio-temporal characteristics, without relying on structural MRI images. In this paper, we propose a patch-wise deep learning method for segmenting human brain tissue into five types: gray matter, white matter, blood vessel, non-brain and cerebrospinal fluid. The proposed method achieves a classification rate of 84.04% and a Dice similarity coefficient of 76.99%, exceeding those of several other methods.
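The Dice similarity coefficient reported above is a standard overlap measure for segmentation masks, 2|A ∩ B| / (|A| + |B|). A minimal numpy implementation (the metric itself, not the paper's segmentation network):

```python
import numpy as np

def dice(pred, target):
    """Dice similarity coefficient between two binary masks:
    2*|A & B| / (|A| + |B|); 1.0 means perfect overlap."""
    pred = np.asarray(pred, bool)
    target = np.asarray(target, bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom

d = dice([1, 1, 0, 0], [1, 0, 1, 0])  # overlap 1, sizes 2 and 2 -> 0.5
```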
Citations: 8
Table Detection in Document Images using Foreground and Background Features
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615795
Saman Arif, F. Shafait
Table detection is an important step in many document analysis systems. It is a difficult problem due to the variety of table layouts and encoding techniques, and the similarity of tabular regions to non-tabular document elements. Earlier approaches to table detection are based on heuristic rules or require additional PDF metadata. Recently proposed methods based on machine learning have shown good results. This paper demonstrates performance improvements over these table detection techniques. The proposed solution is based on the observation that tables tend to contain more numeric data, so it applies colour coding as a signal for telling numeric and textual data apart. A deep-learning-based Faster R-CNN is used to detect tabular regions in document images. To gauge the performance of the proposed solution, the publicly available UNLV dataset is used. Performance measures indicate an improvement over the best in-class strategies.
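The colour-coding idea can be sketched as a simple token classifier: colour numeric tokens one way and textual tokens another before rendering the page for the detector. The regex, colours, and function name are illustrative assumptions, not the paper's exact scheme.

```python
import re

# Illustrative pattern for numeric tokens: optional sign, digits with
# optional thousands separators / decimal point, optional percent sign.
NUMERIC = re.compile(r"[+-]?[\d.,]*\d%?")

def colour_code(token):
    """Assign an illustrative RGB colour by token type, mimicking the idea
    of colouring numeric vs. textual content before detection."""
    if NUMERIC.fullmatch(token):
        return (255, 0, 0)  # numeric: red
    return (0, 0, 255)      # textual: blue

colours = [colour_code(t) for t in ["Revenue", "1,234.5", "2018", "n/a"]]
```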
Citations: 32
DICTA 2018 Keynotes
Pub Date : 2018-12-01 DOI: 10.1109/dicta.2018.8615756
{"title":"DICTA 2018 Keynotes","authors":"","doi":"10.1109/dicta.2018.8615756","DOIUrl":"https://doi.org/10.1109/dicta.2018.8615756","url":null,"abstract":"","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115018684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DGDI: A Dataset for Detecting Glomeruli on Renal Direct Immunofluorescence
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615769
Kun Zhao, Yuliang Tang, Teng Zhang, J. Carvajal, Daniel F. Smith, A. Wiliem, Peter Hobson, A. Jennings, B. Lovell
With the growing popularity of whole-slide scanners, there is a high demand for computer-aided diagnostic techniques for this newly digitized pathology data. The ability to extract effective information from digital slides, which serve as fundamental representations of prognostic data patterns or structures, provides promising opportunities to improve the accuracy of automatic disease diagnosis. Recent advances in computer vision have shown that Convolutional Neural Networks (CNNs) can be used to analyze digitized pathology images, providing more consistent and objective information to pathologists. In this paper, to advance progress in developing computer-aided diagnosis systems for the renal direct immunofluorescence test, we introduce a new benchmark dataset for Detecting Glomeruli on renal Direct Immunofluorescence (DGDI). To build baselines, we investigate various CNN-based detectors on DGDI. Experiments demonstrate that DGDI well represents the challenges of renal direct immunofluorescence image analysis and encourages progress in developing new approaches for understanding renal disease.
Citations: 5
Size-Invariant Attention Accuracy Metric for Image Captioning with High-Resolution Residual Attention
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615788
Zongjian Zhang, Qiang Wu, Yang Wang, Fang Chen
Spatial visual attention mechanisms have achieved significant performance improvements for image captioning. To evaluate attention mechanisms quantitatively, the "attention correctness" metric has been proposed, which calculates the sum of the attention weights generated for ground-truth regions. However, this metric cannot consistently measure attention accuracy among element regions of widely varying size. Moreover, its evaluations are inconsistent with captioning performance across different fine-grained attention resolutions. To address these problems, this paper proposes a size-invariant evaluation metric that normalizes the "attention correctness" metric by the size percentage of the attended region. To demonstrate the efficiency of our size-invariant metric, this paper further proposes a high-resolution residual attention model that uses RefineNet as the Fully Convolutional Network (FCN) encoder. Using the COCO-Stuff dataset, we can perform pixel-level evaluations on both object and "stuff" regions. We use our metric to evaluate the proposed attention model across four fine-grained resolutions (27×27, 40×40, 60×60, 80×80). The results demonstrate that, compared with the "attention correctness" metric, our size-invariant metric is more consistent with captioning performance and more effective for evaluating attention accuracy.
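The normalisation described above can be sketched in numpy: attention correctness sums the (normalised) attention weights inside the ground-truth mask, and dividing by the mask's area fraction makes the score size-invariant. This is an assumed form of the paper's metric, with illustrative data: under uniform attention, a 1-cell region and an 8-cell region both score 1.0.

```python
import numpy as np

def attention_correctness(att, mask):
    """Sum of attention weights falling inside the ground-truth region."""
    att = np.asarray(att, float)
    att = att / att.sum()  # normalise the attention map to sum to 1
    return float(att[np.asarray(mask, bool)].sum())

def size_invariant_score(att, mask):
    """Correctness divided by the region's area fraction, so small and
    large regions are scored on an equal footing."""
    mask = np.asarray(mask, bool)
    return attention_correctness(att, mask) / mask.mean()

uniform = np.ones((4, 4))                       # uniform attention map
small = np.zeros((4, 4), bool); small[0, 0] = True   # 1/16 of the image
large = np.zeros((4, 4), bool); large[:2, :] = True  # 8/16 of the image

score_small = size_invariant_score(uniform, small)
score_large = size_invariant_score(uniform, large)
```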
Citations: 0