
Latest publications from 2018 Digital Image Computing: Techniques and Applications (DICTA)

Object Classification using Deep Learning on Extremely Low-Resolution Time-of-Flight Data
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615877
Ana Daysi Ruvalcaba-Cardenas, T. Scoleri, Geoffrey Day
This paper proposes two novel deep learning models for 2D and 3D classification of objects in extremely low-resolution time-of-flight imagery. The models have been developed to suit contemporary range imaging hardware based on a recently fabricated Single Photon Avalanche Diode (SPAD) camera with 64 × 64 pixel resolution. Being the first prototype of its kind, only a small data set has been collected so far, which makes training models challenging. To bypass this hurdle, transfer learning is applied to the widely used VGG-16 convolutional neural network (CNN), with supplementary layers added specifically to handle SPAD data. This classifier and the renowned Faster-RCNN detector offer benchmark models for comparison to a newly created 3D CNN operating on time-of-flight data acquired by the SPAD sensor. Another contribution of this work is the proposed shot noise removal algorithm, which is particularly useful to mitigate the camera sensitivity in situations of excessive lighting. Models have been tested in both low-light indoor settings and outdoor daytime conditions, on eight objects exhibiting small physical dimensions, low reflectivity, featureless structures and located at ranges from 25m to 700m. Despite these adverse factors, the proposed 2D model has achieved 95% average precision and recall, with higher accuracy for the 3D model.
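The abstract does not detail the shot noise removal algorithm itself. As a generic illustration only (not the authors' method), impulse-like shot noise on a low-resolution time-of-flight frame can be suppressed by replacing pixels that deviate strongly from their local neighbourhood median; the function name and threshold below are assumptions:

```python
import numpy as np

def remove_shot_noise(frame, threshold=0.5):
    """Simple impulse (shot) noise suppression for a low-resolution
    time-of-flight frame: pixels that deviate strongly from their
    3x3 neighbourhood median are replaced by that median.
    A generic illustration only -- not the algorithm from the paper."""
    padded = np.pad(frame.astype(float), 1, mode="edge")
    out = frame.astype(float).copy()
    h, w = frame.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]
            med = np.median(window)
            if abs(out[i, j] - med) > threshold * max(med, 1e-9):
                out[i, j] = med
    return out
```

On a 64 × 64 SPAD frame this runs in a few milliseconds; a real pipeline would likely also exploit the temporal dimension of the time-of-flight data.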
Citations: 7
DICTA 2018 Conference Sponsors
Pub Date : 2018-12-01 DOI: 10.1109/dicta.2018.8615752
Citations: 0
Online Relational Manifold Learning for Multiview Segmentation in Echocardiography
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615773
G. Belous, Andrew Busch, D. Rowlands, Yongsheng Gao
Accurate delineation of the left ventricle (LV) endocardial border in echocardiography is of vital importance for the diagnosis and treatment of heart disease. Effective segmentation of the LV is challenging due to low contrast, signal dropout and acoustic noise. Where low-level and region-based image cues are unable to define the LV boundary, shape prior models are critical to infer shape. These models perform well when there is low variability in the underlying shape subspace and the shape instance produced by appearance cues does not contain gross errors; however, when these conditions do not hold, results are often much poorer. In this paper, we first propose a shape model to overcome the problem of modelling complex shape subspaces. Our method connects the implicit relationship between image features and shape by extending graph regularized sparse nonnegative matrix factorization (NMF) to jointly learn the structure and connection between two low dimensional manifolds comprising image features and shapes, respectively. We extend conventional NMF learning to an online learning-based approach where the input image is used to leverage the learning and connection of each manifold to the most relevant subspace regions. This ensures robust shape inference and a shape model constructed from contextually relevant shapes. A fully automatic segmentation approach using a probabilistic framework is then proposed to detect the LV endocardial border. Our method is applied to a diverse dataset that contains multiple views of the LV. Results show the effectiveness of our approach compared to state-of-the-art methods.
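As background for the graph-regularized NMF the abstract builds on, here is a minimal sketch of the standard multiplicative-update formulation (after Cai et al.'s GNMF), factorizing X ≈ WH with a graph affinity S over samples; the function name, hyperparameters, and the trivial graph are assumptions, and this is not the authors' online relational method:

```python
import numpy as np

def gnmf(X, S, k, lam=0.1, iters=200, seed=0):
    """Graph-regularized NMF: X (m x n) ~= W (m x k) @ H (k x n),
    with an affinity matrix S over the n samples encouraging
    neighbouring samples to have similar codes in H.
    Multiplicative updates; a minimal sketch, not the paper's
    full online relational manifold learning approach."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + 0.1
    H = rng.random((k, n)) + 0.1
    D = np.diag(S.sum(axis=1))  # degree matrix of the graph
    eps = 1e-9
    for _ in range(iters):
        # standard NMF update for the basis
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # code update with the graph-Laplacian regularization term
        H *= (W.T @ X + lam * H @ S) / (W.T @ W @ H + lam * H @ D + eps)
    return W, H
```

The multiplicative form keeps both factors nonnegative throughout, which is what makes NMF parts-based and interpretable for shape dictionaries.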
Citations: 0
Image Enhancement for Face Recognition in Adverse Environments
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615793
D. Kamenetsky, Sau Yee Yiu, Martyn Hole
Face recognition in adverse environments, such as at long distances or in low light conditions, remains a challenging task for current state-of-the-art face matching algorithms. The facial images taken in these conditions are often low resolution and low quality due to the effects of atmospheric turbulence and/or an insufficient amount of light reaching the camera. In this work, we use an atmospheric turbulence mitigation algorithm (MPE) to enhance low resolution RGB videos of faces captured either at long distances or in low light conditions. Due to its interactive nature, MPE is tuned to work well in each specific environment. We also propose three image enhancement techniques that further improve the images produced by MPE: two for low light imagery (MPEf and fMPE) and one for long distance imagery (MPEh). Experimental results show that all three methods significantly improve the image quality and face recognition performance, allowing effective face recognition in almost complete darkness (at close range) or at distances up to 200m (in daylight).
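MPE itself is not described in the abstract; as a very crude stand-in for the idea of turbulence mitigation over a video burst, per-pixel temporal fusion of registered frames suppresses jitter and sensor noise while preserving the static face content. The function below is purely illustrative and assumed, not part of MPE:

```python
import numpy as np

def temporal_fuse(frames):
    """Per-pixel temporal median over a short burst of (already
    registered) frames: random impulses and turbulence-induced
    outliers are rejected while stable scene content remains.
    Illustrative only -- MPE is far more sophisticated."""
    stack = np.stack([f.astype(float) for f in frames], axis=0)
    return np.median(stack, axis=0)
```

Real turbulence mitigation pipelines additionally register frames and select "lucky" sharp regions before fusing; the median is only the simplest robust fusion step.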
Citations: 4
Convolutional 3D Attention Network for Video Based Freezing of Gait Recognition
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615791
Renfei Sun, Zhiyong Wang, K. E. Martens, S. Lewis
Freezing of gait (FoG) is defined as a brief, episodic absence or marked reduction of forward progression of the feet despite the intention to walk. It is a typical symptom of Parkinson's disease (PD) and has a significant impact on the life quality of PD patients. Generally, trained experts need to review the gait of a patient for clinical diagnosis, which is time consuming and subjective. Nowadays, automatic FoG identification from videos provides a promising solution to address these issues by formulating FoG identification as a human action recognition task. However, most existing human action recognition algorithms are limited in this task, as FoG is very subtle and can easily be overlooked when interfered with by irrelevant motion. In this paper, we propose a novel action recognition algorithm, namely convolutional 3D attention network (C3DAN), to address this issue by learning an informative region for more effective recognition. The network consists of two main parts: Spatial Attention Network (SAN) and 3-dimensional convolutional network (C3D). SAN aims to generate an attention region from coarse to fine, while C3D extracts discriminative features. Our proposed approach is able to localize the attention region without manual annotation and to extract discriminative features in an end-to-end manner. We evaluate our proposed C3DAN method on a video dataset collected from 45 PD patients in a clinical setting for the quantification of FoG in PD. We obtained sensitivity of 68.2%, specificity of 80.8% and accuracy of 79.3%, which outperformed several state-of-the-art human action recognition methods. To the best of our knowledge, our work is one of the first studies detecting FoG from clinical videos.
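The sensitivity, specificity and accuracy figures quoted above follow the standard definitions from the binary confusion matrix (1 = freezing episode, 0 = normal gait). A small helper makes the relationship explicit; the function name is an assumption:

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP),
    accuracy = (TP+TN)/N, for binary labels where 1 denotes a
    freezing-of-gait episode and 0 denotes normal gait."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy
```

Note that a sensitivity of 68.2% alongside 80.8% specificity means roughly one in three freezing episodes is still missed, which is why the authors frame this as an early step.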
Citations: 19
Clearing Multiview Structure Graph from Inconsistencies
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615787
S. Kabbour, Pierre-Yves Richard
Dealing with repetitive patterns in images proves difficult in multiview structure from motion. Previous work in the field suggests that this problem can be solved by clearing inconsistent rotations from the visual graph that represents pairwise relations between images. We therefore present a simple and rather effective cycle-based algorithm for clearing the graph. Since generating all cycles within the graph is computationally infeasible in most cases, we verify only the cycles we need, without relying on the spanning-tree method, which places a heavy emphasis on certain edges.
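The consistency test underlying such cycle-based pruning is standard: composing the pairwise relative rotations around a closed cycle must return the identity, and cycles that fail the test contain at least one bad edge. A minimal sketch of that check (tolerance and function name are assumptions, and the paper's cycle-selection strategy is not reproduced here):

```python
import numpy as np

def cycle_consistent(rotations, tol=1e-6):
    """Check one cycle of pairwise rotations: composing the 3x3
    relative rotation matrices R_{i->i+1} around the cycle (the
    last entry closing it) should give the identity. Cycles that
    fail contain at least one inconsistent edge to prune."""
    acc = np.eye(3)
    for R in rotations:
        acc = R @ acc
    return np.linalg.norm(acc - np.eye(3)) < tol
```

In practice a relaxed tolerance absorbs estimation noise, and votes from many cycles are combined before an edge is actually removed.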
Citations: 0
Image Representation using Bag of Perceptual Curve Features
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615816
Elham Etemad, Q. Gao
Many applications, such as augmented or mixed reality, have limited training data and computing power, making convolutional neural networks inapplicable in those domains. In this method, we extract the perceptual edge map of the image and group its perceptual structure-based edge elements according to gestalt psychology. The connecting points of these groups, called curve partitioning points (CPPs), are descriptive areas of the image and are utilized for image representation. The global perceptual image features and local image representation methods are combined to encode the image according to the generated bag of CPPs using spatial pyramid matching. Experiments on multi-label and single-label datasets show the superiority of the proposed method.
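Spatial pyramid matching, as referenced above, histograms quantized features per cell at several grid resolutions and concatenates the results. A standard SPM sketch over quantized CPP locations (all parameter names are assumptions; the paper's exact encoding is not reproduced):

```python
import numpy as np

def spatial_pyramid_histogram(points, labels, n_words, levels=2, size=64):
    """Spatial pyramid over a bag of quantized features: 'points'
    are (x, y) descriptor locations (here, CPPs), 'labels' their
    visual-word indices in [0, n_words). Level l splits the
    size x size image into 2^l x 2^l cells; per-cell word
    histograms are concatenated and L1-normalized."""
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        cell = size / cells
        for cy in range(cells):
            for cx in range(cells):
                hist = np.zeros(n_words)
                for (x, y), w in zip(points, labels):
                    if int(x // cell) == cx and int(y // cell) == cy:
                        hist[w] += 1
                feats.append(hist)
    v = np.concatenate(feats)
    return v / max(v.sum(), 1)
```

The vector length is n_words × Σ 4^l over the levels, so coarse global statistics and fine local layout are encoded jointly.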
Citations: 0
Image Processing for Traceability: A System Prototype for the Southern Rock Lobster (SRL) Supply Chain
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615842
Son Anh Vo, J. Scanlan, L. Mirowski, P. Turner
This paper describes how conventional image processing techniques can be applied to the grading of Southern Rock Lobsters (SRL) to produce a high-quality data layer which could be an input into product traceability. The research is part of a broader investigation into designing a low-cost biometric identification solution for use along the entire lobster supply chain. In approaching the image processing for lobster grading, a key consideration is to develop a system capable of using low-cost consumer-grade cameras readily available in mobile phones. The results confirm that by combining a number of common techniques in computer vision it is possible to capture and process a set of valuable attributes from a sampled lobster image, including color, length, weight, legs and sex. By combining this image profile with other pre-existing data on catch location and landing port, each lobster can be verifiably tracked along the supply chain journey to markets in China. The image processing research results achieved in the laboratory show high accuracy in measuring lobster carapace length, which is vital for weight conversion calculations. The results also demonstrate the capability to obtain reliable values for average color, tail shape and number of legs on a lobster used in grading classifications. The findings are a major first step in the development of individual lobster biometric identification and will directly contribute to automating lobster grading in this valuable Australian fishery.
Citations: 3
A New Method for Removing Asymmetric High Density Salt and Pepper Noise
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615814
Allan Pennings, I. Svalbe
The presence of salt and pepper noise in imaging is a common issue that needs to be overcome in image analysis. Many potential solutions to remove this noise have been discussed over the years, but these algorithms often make the common assumption that salt noise and pepper noise appear in equal densities. This is not necessarily the case. In this paper several filters are proposed and tested across a range of different salt-to-pepper ratios, yielding higher PSNR and SSIM than other existing filters.
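To make the asymmetry concrete: a baseline in the spirit of the problem treats the two extremes separately and replaces each corrupted pixel with the median of only the non-corrupted neighbours, so a heavy excess of salt does not drag the estimate upward. This is a generic illustrative filter, not one of the paper's proposed filters:

```python
import numpy as np

def asymmetric_sp_filter(img, salt=255, pepper=0):
    """Remove salt (255) and pepper (0) impulses occurring at
    possibly unequal densities: only pixels at the extremes are
    touched, each replaced by the median of the non-corrupted
    pixels in its 3x3 neighbourhood. A simple baseline sketch,
    not the authors' filters."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = img.astype(float).copy()
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            if out[i, j] in (salt, pepper):
                window = padded[i:i + 3, j:j + 3].ravel()
                clean = window[(window != salt) & (window != pepper)]
                if clean.size:
                    out[i, j] = np.median(clean)
    return out
```

At very high noise densities the 3×3 window may contain no clean pixels, which is where adaptive window sizes (and the asymmetry-aware designs the paper evaluates) become necessary.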
Citations: 0
Drivers Performance Evaluation using Physiological Measurement in a Driving Simulator
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615763
Afsaneh Koohestani, P. Kebria, A. Khosravi, S. Nahavandi
Monitoring drivers' behaviour and detecting their awareness are of vital importance for road safety. Driver distraction and low awareness are already known to be the main reason for accidents in the world. Distraction-related crashes have greatly increased in recent years due to the proliferation of communication, entertainment, and malfunctioning of driver assistance systems. Accordingly, there is a need for advanced systems to monitor drivers' behaviour and generate a warning if a degradation in a driver's performance is detected. The purpose of this study is to analyse the vehicle and driver data to detect the onset of distraction. Physiological measurements, such as palm electrodermal activity, heart rate, breathing rate, and perinasal perspiration, are analysed and applied for the development of the monitoring system. The dataset used in this research has these measurements for 68 healthy participants (35 male, 33 female / 17 elderly, 51 young). These participants completed two driving sessions in a driving simulator, including the normal and loaded drive. In the loaded scenario, drivers were texting back words. The lane deviation of the vehicle was recorded as the response variable. Different classification algorithms such as generalised linear, support vector model, K-nearest neighbour and random forest machines are implemented to classify the driver's performance based on input features. Prediction results indicate that random forest performs the best by achieving an area under the curve (AUC) of over 91%. It is also found that biographic features are not informative enough to analyse driver performance, while perinasal perspiration carries the most information.
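The AUC reported for the random forest can be computed directly from raw classifier scores via the rank-sum (Mann-Whitney) formulation: the probability that a randomly chosen positive is scored above a randomly chosen negative. A self-contained sketch (the function name is an assumption; the paper does not specify its evaluation code):

```python
import numpy as np

def auc_score(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    fraction of (positive, negative) pairs where the positive
    outscores the negative, with ties counting half."""
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

This pairwise form is equivalent to integrating the ROC curve, and makes clear why AUC is insensitive to class imbalance in the label counts.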
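The abstract reports classifier quality as area under the ROC curve (AUC). A minimal sketch of that metric on hypothetical classifier scores (not the paper's data or pipeline): AUC equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counting half.

```python
def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive example receives the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical "distracted vs. attentive" scores from a classifier:
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))  # 8 of 9 pairs ranked correctly -> 0.888...
```

A perfect ranking gives 1.0 and a random one about 0.5, so the paper's reported AUC above 91% indicates the random forest orders distracted drivers above attentive ones in the large majority of pairwise comparisons.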
{"title":"Drivers Performance Evaluation using Physiological Measurement in a Driving Simulator","authors":"Afsaneh Koohestani, P. Kebria, A. Khosravi, S. Nahavandi","doi":"10.1109/DICTA.2018.8615763","DOIUrl":"https://doi.org/10.1109/DICTA.2018.8615763","url":null,"abstract":"Monitoring the drivers behaviour and detecting their awareness are of vital importance for road safety. Drivers distraction and low awareness are already known to be the main reason for accidents in the world. Distraction-related crashes have greatly increased in recent years due to the proliferation of communication, entertainment, and malfunctioning of driver assistance systems. Accordingly, there is a need for advanced systems to monitor the drivers behaviour and generate a warning if a degradation in a drivers performance is detected. The purpose of this study is to analyse the vehicle and drivers data to detect the onset of distraction. Physiological measurements, such as palm electrodermal activity, heart rate, breathing rate, and perinasal perspiration are analysed and applied for the development of the monitoring system. The dataset used in this research has these measurements for 68 healthy participants (35 male, 33 female/17 elderly, 51 young). These participants completed two driving sessions in a driving simulator, including the normal and loaded drive. In the loaded scenario, drivers were texting back words. The lane deviation of vehicle was recorded as the response variable. Different classification algorithms such as generalised linear, support vector model, K-nearest neighbour and random forest machines are implemented to classify the driver's performance based on input features. Prediction results indicate that random forest performs the best by achieving an area under the curve (AUC) of over 91%. It is also found that biographic features are not informative enough to analyse drivers performance while perinasal perspiration carries the most information.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123474926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Journal
2018 Digital Image Computing: Techniques and Applications (DICTA)