
Latest Publications from 2018 Digital Image Computing: Techniques and Applications (DICTA)

Memory and Time Efficient 3D Neuron Morphology Tracing in Large-Scale Images
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615765
Heng Wang, Donghao Zhang, Yang Song, Siqi Liu, Rong Gao, Hanchuan Peng, Weidong (Tom) Cai
3D reconstruction of neuronal morphology is crucial to solving neuron-related problems in neuroscience, as it is a key technique for investigating the connectivity and functionality of the neuronal system. Many methods have been proposed to improve the accuracy of digital neuron reconstruction. However, the large amount of computer memory and computation time they require to process large-scale images poses a new challenge. To solve this problem, we introduce a novel Memory (and Time) Efficient Image Tracing (MEIT) framework. Evaluated on the Gold dataset, our proposed method achieves better or competitive performance compared with state-of-the-art neuron tracing methods in most cases while requiring less memory and time.
Citations: 8
Enhancing the Effectiveness of Local Descriptor Based Image Matching
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615800
Md Tahmid Hossain, S. Teng, Dengsheng Zhang, Suryani Lim, Guojun Lu
Image registration has received great attention from researchers over the last few decades. SIFT (Scale Invariant Feature Transform), a local descriptor-based technique, is widely used for registering and matching images. To establish correspondences between images, SIFT uses a Euclidean distance ratio metric. However, this approach produces many incorrect matches, and eliminating these inaccurate matches has been a challenge. Various methods have been proposed to mitigate this problem. In this paper, we propose a scale and orientation harmony-based pruning method that improves the image matching process by successfully eliminating incorrect SIFT descriptor matches. Moreover, our technique can predict the image transformation parameters based on a novel adaptive clustering method with much higher matching accuracy. Our experimental results show that the proposed method achieves, on average, approximately 16% and 10% higher matching accuracy than traditional SIFT and a contemporary method, respectively.
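As a rough illustration of the kind of descriptor-level consistency the abstract describes, the sketch below runs OpenCV SIFT matching with the standard Euclidean distance ratio test and then drops matches whose keypoint scale ratio or orientation difference disagrees with the dominant values. The file names, the 0.75 ratio threshold and the median-based pruning rule are illustrative assumptions, not the paper's harmony measure or adaptive clustering.

```python
# Sketch: SIFT matching with Lowe's ratio test, followed by a simple
# scale/orientation-consistency pruning step (illustrative only).
import cv2
import numpy as np

# placeholder file names: substitute any overlapping image pair
img1 = cv2.imread("scene_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test on the two nearest neighbours (Euclidean distance ratio metric)
matcher = cv2.BFMatcher(cv2.NORM_L2)
pairs = [p for p in matcher.knnMatch(des1, des2, k=2) if len(p) == 2]
ratio_matches = [m for m, n in pairs if m.distance < 0.75 * n.distance]

# Consistency pruning: keep matches whose keypoint scale ratio and orientation
# difference agree with the dominant (median) values over all tentative matches.
scales = np.array([kp2[m.trainIdx].size / kp1[m.queryIdx].size for m in ratio_matches])
angles = np.array([(kp2[m.trainIdx].angle - kp1[m.queryIdx].angle) % 360.0 for m in ratio_matches])
med_s, med_a = np.median(scales), np.median(angles)

def ang_diff(a, b):
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

pruned = [m for m, s, a in zip(ratio_matches, scales, angles)
          if abs(s - med_s) < 0.3 * med_s and ang_diff(a, med_a) < 20.0]
print(len(ratio_matches), "ratio-test matches ->", len(pruned), "after pruning")
```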
Citations: 0
DeepParse: A Trainable Postal Address Parser
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615844
N. Abid, A. Ul-Hasan, F. Shafait
Postal applications are among the first beneficiaries of advancements in document image processing techniques due to their economic significance. To automate postal services, it is necessary to integrate contributions from a wide range of image processing domains, from image acquisition and preprocessing to interpretation through symbol, character and word recognition. Lately, machine learning approaches have been deployed for postal address processing. The parsing problem has been explored using different techniques, such as regular expressions, Conditional Random Fields (CRFs), Hidden Markov Models (HMMs), Decision Trees and Support Vector Machines (SVMs). These traditional techniques are designed on the assumption that the data is free from OCR errors, which decreases the adaptability of the architecture in real-world scenarios. Furthermore, their performance degrades in the presence of non-standardized addresses, which results in the intermixing of similar classes. In this paper, we present DeepParse, the first trainable neural-network-based robust architecture for postal address parsing that tackles these issues and can be applied to any Named Entity Recognition (NER) problem. The architecture takes input at different granularity levels: characters, character trigrams and words, to extract and learn features and classify the address fields. The model was trained on a synthetically generated dataset and tested on real-world addresses. DeepParse has also been tested on the NER dataset CoNLL2003, achieving 90.44%, which is on par with state-of-the-art techniques.
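A minimal sketch of the multi-granularity idea (character, trigram and word inputs fused for per-token address-field tagging) is given below in PyTorch. The layer types, vocabulary sizes and dimensions are placeholders chosen for illustration; the actual DeepParse architecture is not specified here.

```python
# Sketch of a multi-granularity sequence tagger, assuming pre-built integer
# vocabularies for words, characters and hashed trigrams (all hypothetical).
import torch
import torch.nn as nn

class AddressTagger(nn.Module):
    def __init__(self, n_words=20000, n_chars=100, n_trigrams=5000, n_tags=10,
                 w_dim=64, c_dim=16, t_dim=32, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, w_dim, padding_idx=0)
        self.char_emb = nn.Embedding(n_chars, c_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(c_dim, c_dim, kernel_size=3, padding=1)
        self.tri_emb = nn.Embedding(n_trigrams, t_dim, padding_idx=0)
        self.rnn = nn.LSTM(w_dim + c_dim + t_dim, hidden,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars, trigrams):
        # words: (B, T); chars: (B, T, L); trigrams: (B, T)
        B, T, L = chars.shape
        w = self.word_emb(words)                                    # (B, T, w_dim)
        c = self.char_emb(chars).view(B * T, L, -1).transpose(1, 2)
        c = self.char_cnn(c).max(dim=2).values.view(B, T, -1)       # (B, T, c_dim)
        t = self.tri_emb(trigrams)                                  # (B, T, t_dim)
        h, _ = self.rnn(torch.cat([w, c, t], dim=-1))
        return self.out(h)                                          # per-token tag scores

model = AddressTagger()
scores = model(torch.zeros(2, 6, dtype=torch.long),
               torch.zeros(2, 6, 16, dtype=torch.long),
               torch.zeros(2, 6, dtype=torch.long))
print(scores.shape)  # torch.Size([2, 6, 10])
```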
Citations: 13
Human Ear Surface Reconstruction Through Morphable Model Deformation
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615786
S. Kabbour, Pierre-Yves Richard
In this paper, a novel fully automated method is developed to acquire an accurate 3D surface reconstruction of the human ear using multi-view stereo vision and a morphable model without texture. As the results show, our method outperforms state-of-the-art approaches. Our method uses a template to estimate the pose and orientation of the camera without relying on correspondences. After dense reconstruction is done, the ear morphable model is fitted to the resulting point cloud by minimizing the distance between them; the form of the model can be transformed as desired through its coefficients, and only shape, without texture, is used to converge these coefficients.
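The fitting step described above can be pictured with the toy sketch below: a linear morphable model (mean plus PCA basis times coefficients) is aligned to a point cloud by alternating nearest-neighbour correspondences with a regularised least-squares update of the coefficients. All arrays are synthetic placeholders, rigid pose alignment is assumed to be already done, and this is not the authors' exact optimisation.

```python
# Toy sketch: fit a linear morphable model to a point cloud by alternating
# nearest-neighbour correspondences and a regularised coefficient update.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
n_vertices, n_components = 500, 10
mean_shape = rng.normal(size=(n_vertices, 3))             # placeholder mean ear mesh
basis = rng.normal(size=(n_vertices * 3, n_components))   # placeholder PCA basis
point_cloud = mean_shape + 0.05 * rng.normal(size=(n_vertices, 3))  # MVS cloud stand-in

coeffs = np.zeros(n_components)
lam = 1e-2  # regularisation keeps the fitted shape close to the mean
for _ in range(10):
    shape = (mean_shape.reshape(-1) + basis @ coeffs).reshape(-1, 3)
    # correspondences: nearest cloud point for every model vertex
    _, idx = cKDTree(point_cloud).query(shape)
    residual = point_cloud[idx].reshape(-1) - mean_shape.reshape(-1)
    # solve (B^T B + lam I) c = B^T r for the shape coefficients
    coeffs = np.linalg.solve(basis.T @ basis + lam * np.eye(n_components),
                             basis.T @ residual)

shape = (mean_shape.reshape(-1) + basis @ coeffs).reshape(-1, 3)
d, _ = cKDTree(point_cloud).query(shape)
print("final RMS model-to-cloud distance:", float(np.sqrt(np.mean(d ** 2))))
```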
Citations: 1
Optimization of a Principal Component Analysis Implementation on Field-Programmable Gate Arrays (FPGA) for Analysis of Spectral Images
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615866
M. Schellhorn, G. Notni
For spectral measurement technology to be accepted for quality assurance and inspection in the industrial sector, the acquisition and processing of spectral images must be adapted to the production cycle. When processing spectral images, variants of Principal Component Analysis (PCA) are often used as preprocessing steps, for example for segmentation, spectral decomposition or data compression. To speed up this time-consuming algorithm, hardware and software cores were implemented on a system-on-a-programmable-chip (SoPC). This paper deals with the optimization of this implementation to minimize calculation times. Special attention is paid to the cores used to calculate the covariances and the data derivation. The restructuring of the hardware IP (intellectual property) cores and fundamental design decisions are discussed. The optimization was implemented and evaluated on a 12-channel spectral camera.
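For reference, the PCA steps that such hardware cores typically accelerate (mean subtraction, channel covariance, eigendecomposition, projection) look as follows in plain NumPy for a 12-channel cube. The random data stands in for real camera frames, and nothing here reflects the FPGA implementation itself.

```python
# Plain NumPy reference of the standard PCA pipeline on a spectral image cube.
import numpy as np

cube = np.random.rand(480, 640, 12).astype(np.float32)   # H x W x channels (synthetic)
H, W, C = cube.shape
X = cube.reshape(-1, C)                                   # one spectrum per pixel

mean = X.mean(axis=0)
Xc = X - mean                                             # mean subtraction (centring)
cov = (Xc.T @ Xc) / (Xc.shape[0] - 1)                     # C x C channel covariance

eigvals, eigvecs = np.linalg.eigh(cov)                    # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order[:3]]                        # keep 3 principal components

scores = Xc @ components                                  # project every pixel spectrum
pc_images = scores.reshape(H, W, 3)                       # back to image layout
print(pc_images.shape, "explained variance ratio:",
      float(eigvals[order[:3]].sum() / eigvals.sum()))
```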
Citations: 5
3D Multiview Basketball Players Detection and Localization Based on Probabilistic Occupancy
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615798
Yukun Yang, Min Xu, Wanneng Wu, Ruiheng Zhang, Yu Peng
This paper addresses the problem of 3D multiview basketball player detection and localization. Existing methods for this problem typically take background subtraction as input, which limits localization accuracy and the performance of subsequent object tracking. Moreover, the performance of background-subtraction-based methods is heavily affected by occlusions in crowded scenes. In this paper, we propose an innovative method that jointly implements deep-learning-based player detection and occupancy-probability-based player localization. Furthermore, a new Bayesian model of the localization algorithm is developed, which uses foreground information from fisheye cameras to set up meaningful initialization values in the first iteration, in order not only to eliminate ambiguous detections but also to accelerate computation. Experimental results on real basketball game data demonstrate that our method significantly improves on current methods by eliminating missed and false detections and increasing the probability of positive results.
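A toy sketch of the occupancy idea is shown below: per-camera detection probabilities over a discretised ground plane are fused in log-odds space on top of a prior initialised from (here synthetic) fisheye foreground coverage. The grid size, probabilities and naive Bayes fusion are illustrative assumptions and much simpler than the paper's generative model.

```python
# Toy multiview occupancy fusion on a ground-plane grid (synthetic inputs).
import numpy as np

grid_shape = (30, 50)                        # discretised court ground plane
rng = np.random.default_rng(1)
prior = 0.01 + 0.2 * rng.random(grid_shape)  # prior from fisheye foreground ratio (stand-in)
cam_probs = [np.clip(rng.random(grid_shape), 1e-3, 1 - 1e-3) for _ in range(4)]

# naive Bayes fusion in log-odds space: sum per-camera evidence over the prior
log_odds = np.log(prior / (1 - prior))
for p in cam_probs:
    log_odds += np.log(p / (1 - p))
posterior = 1.0 / (1.0 + np.exp(-log_odds))

players = np.argwhere(posterior > 0.9)       # occupied cells -> candidate player locations
print("occupied cells:", len(players))
```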
Citations: 9
Data Augmentation using Evolutionary Image Processing
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615799
Kosaku Fujita, Masayuki Kobayashi, T. Nagao
In the machine learning community, data augmentation techniques have been widely used to make deep neural networks invariant to object transition. However, less attention has been paid to data augmentation in traditional classification methods. In this paper, we take a closer look at traditional classification methods and introduce a new data augmentation technique based on the concept of image transformation. Starting with a few existing examples, we add noise and generate new data points to reduce sparseness in a given feature space. Then, we generate images corresponding to the new data points, although this is usually an ill-posed problem. Herein, the novelty lies in constructing an image transformation tree and generating new data from a small number of instances. This allows us to reduce sparseness in the feature space and build more robust classifiers. We evaluate our method on the Caltech-101 dataset to verify its potential. When the amount of training data is limited, we demonstrate that support vector machine-based classifiers trained with a dataset augmented using our method outperform classifiers trained with the original dataset in most cases.
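The general workflow of augmenting a small training set and comparing SVM classifiers can be sketched as below with scikit-learn digits; the shift-and-noise transformations are generic stand-ins, not the evolutionary image transformation trees proposed in the paper.

```python
# Sketch: classical augmentation of a small training set feeding an SVM,
# compared against the same SVM trained on the original (tiny) set.
import numpy as np
from scipy.ndimage import shift
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X, y = digits.images, digits.target          # (n, 8, 8) images, labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.05, stratify=y, random_state=0)

def augment(img, rng):
    # random small shift plus additive noise (generic transformations)
    moved = shift(img, rng.integers(-1, 2, size=2), mode="nearest")
    return np.clip(moved + rng.normal(0, 0.5, img.shape), 0, 16)

rng = np.random.default_rng(0)
aug_X, aug_y = list(X_train), list(y_train)
for img, label in zip(X_train, y_train):
    for _ in range(5):
        aug_X.append(augment(img, rng))
        aug_y.append(label)

clf = svm.SVC(kernel="linear").fit(np.array(aug_X).reshape(len(aug_X), -1), aug_y)
base = svm.SVC(kernel="linear").fit(X_train.reshape(len(X_train), -1), y_train)
Xt = X_test.reshape(len(X_test), -1)
print("baseline:", base.score(Xt, y_test), "augmented:", clf.score(Xt, y_test))
```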
Citations: 11
Image Registration via Geometrically Constrained Total Variation Optical Flow
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615805
M. Shoeiby, M. Armin, A. Robles-Kelly
In this paper, we present a method for the registration of image pairs. Our method relates the two images to one another for registration purposes by making use of optical flow. We formulate the problem in a variational setting using an L1-norm fidelity term, a total variation (TV) criterion, and a geometric constraint. This treatment leads to a cost function in which both the total variation and the homographic constraint are enforced via regularisation. Further, to compute the flow we employ a multiscale pyramid, whereby the total variation is minimized at each layer and the geometric constraint is enforced between layers. In practice, this is carried out by using a Rudin-Osher-Fatemi (ROF) denoising model within each layer and a gated function for the homography computation between layers. We also illustrate the utility of our method for image registration and flow computation, and compare our approach to a mainstream non-geometrically constrained variational alternative from the literature.
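For readers who want to experiment with the ingredients, the sketch below computes a TV-L1 flow field with OpenCV's contrib module and then fits a homography to the flow correspondences with RANSAC. Note that this applies the geometric model after the flow rather than enforcing it inside the variational optimisation as the paper does; file names are placeholders and opencv-contrib-python is assumed to be installed.

```python
# Sketch: off-the-shelf TV-L1 dense flow followed by a post-hoc homography fit.
import cv2
import numpy as np

# placeholder file names for two frames of the same scene
img0 = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)
img1 = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)

tvl1 = cv2.optflow.createOptFlow_DualTVL1()   # TV-L1 dense flow (contrib module)
flow = tvl1.calc(img0, img1, None)            # H x W x 2 displacement field

h, w = img0.shape
ys, xs = np.mgrid[0:h:10, 0:w:10]             # subsample a regular point grid
src = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
dst = src + flow[ys.ravel(), xs.ravel()]      # flow-displaced grid points

# fit a single homography to the flow correspondences with RANSAC
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
print("homography:\n", H, "\ninlier ratio:", float(inliers.mean()))
```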
Citations: 0
Demodulation of Multi-Level Data using Convolutional Neural Network in Holographic Data Storage
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615863
Yutaro Katano, Tetsuhiko Muroi, N. Kinoshita, Norihiko Ishii
We evaluated a deep-learning-based data demodulation method for multi-level recording data in holographic data storage. This method demodulates the reproduced data via pattern recognition using a convolutional neural network. The network learns the demodulation rule while taking into account the optical noise that degrades the quality of the reproduced data. Unlike a conventional hard-decision method, the learnt network demodulated the noise-added data accurately and reduced demodulation errors.
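The pattern-recognition view of demodulation can be sketched as a small CNN that classifies reproduced symbol patches into discrete recording levels, as below in PyTorch. The patch size, number of levels and layer sizes are assumptions for illustration; the paper's network and training data are not reproduced.

```python
# Sketch: CNN demodulator classifying noisy symbol patches into recording levels.
import torch
import torch.nn as nn

n_levels = 4            # assumed number of recording levels per symbol
patch = 8               # assumed symbol patch size in pixels

demodulator = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, n_levels),
)

# one optimisation step on synthetic noisy patches
x = torch.rand(64, 1, patch, patch)          # reproduced (noisy) symbol patches
y = torch.randint(0, n_levels, (64,))        # ground-truth recorded levels
loss = nn.CrossEntropyLoss()(demodulator(x), y)
loss.backward()
print("demodulation logits for one patch:", demodulator(x[:1]).detach())
```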
Citations: 3
Generative Adversarial Network (GAN) Based Data Augmentation for Palmprint Recognition
Pub Date : 2018-12-01 DOI: 10.1109/DICTA.2018.8615782
Gengxing Wang, Wenxiong Kang, Qiuxia Wu, Zhiyong Wang, Junbin Gao
Palmprint recognition is a very important field of biometrics and has been intensively researched in terms of both feature extraction and classification methods. Recently, deep learning techniques such as convolutional neural networks have demonstrated clear advantages over traditional learning algorithms for various image classification tasks such as object recognition and detection. However, a large amount of data is needed to train deep networks, which limits their application to tasks such as palmprint recognition, where sufficient training samples for each class (i.e., each individual) are lacking. In this paper, we propose a Generative Adversarial Network (GAN) based solution to augment training data for improved palmprint recognition performance. An improved Deep Convolutional Generative Adversarial Network (DCGAN) is first devised to generate high-quality palmprint images by replacing the transposed convolution layers with linear upsampling and introducing the Structural Similarity (SSIM) index into the loss function. As a result, the generated images have discriminative features, increased smoothness and consistency, and less variance compared with those generated by the baseline DCGAN. Then, a mixed training strategy combining GAN-based and classical data augmentation techniques is adopted to further improve recognition performance. The experimental results on two publicly available datasets demonstrate the effectiveness of our proposed GAN-based data augmentation method for palmprint recognition. Our method achieves Equal Error Rates (EER) of 1.52% and 0.37% on the IIT Delhi and CASIA palmprint datasets, respectively, which outperforms other existing methods.
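A rough PyTorch sketch of the two modifications mentioned above is given below: a DCGAN-style generator that uses bilinear upsampling plus convolution instead of transposed convolutions, and a deliberately simplified, non-windowed structural-similarity term added to the generator loss. Channel counts, image size and the way the SSIM term is combined with the adversarial loss are assumptions, not the authors' exact configuration.

```python
# Sketch: upsample+conv generator and a simplified SSIM term in the loss.
import torch
import torch.nn as nn

class UpsampleGenerator(nn.Module):
    def __init__(self, z_dim=100, base=64, out_channels=1):
        super().__init__()
        self.fc = nn.Linear(z_dim, base * 8 * 4 * 4)      # 4x4 seed feature map
        def up(cin, cout):
            # bilinear upsampling + 3x3 convolution instead of transposed convolution
            return nn.Sequential(
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(cin, cout, 3, padding=1),
                nn.BatchNorm2d(cout),
                nn.ReLU(inplace=True))
        self.body = nn.Sequential(
            up(base * 8, base * 4),   # 4 -> 8
            up(base * 4, base * 2),   # 8 -> 16
            up(base * 2, base),       # 16 -> 32
            up(base, base),           # 32 -> 64
            nn.Conv2d(base, out_channels, 3, padding=1),
            nn.Tanh())

    def forward(self, z):
        return self.body(self.fc(z).view(z.size(0), -1, 4, 4))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # simplified, non-windowed SSIM over whole images (illustration only)
    mx, my = x.mean(dim=(1, 2, 3)), y.mean(dim=(1, 2, 3))
    dx, dy = x - mx[:, None, None, None], y - my[:, None, None, None]
    vx, vy = (dx ** 2).mean(dim=(1, 2, 3)), (dy ** 2).mean(dim=(1, 2, 3))
    cov = (dx * dy).mean(dim=(1, 2, 3))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

G = UpsampleGenerator()
fake = G(torch.randn(8, 100))                     # 8 generated 64x64 images
real = torch.rand(8, 1, 64, 64) * 2 - 1           # placeholder real batch
adv_loss = torch.zeros(())                        # stands in for the adversarial loss
loss = adv_loss + (1 - global_ssim(fake, real)).mean()
print(fake.shape, float(loss))
```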
Citations: 30