
Latest publications from 2018 Digital Image Computing: Techniques and Applications (DICTA)

Strategies for Merging Hyperspectral Data of Different Spectral and Spatial Resolution
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615875
R. Illmann, M. Rosenberger, G. Notni
The growing number of applications for hyperspectral measurement places increasing demands on the handling of large measurement datasets. Push-broom imaging is a promising measurement technique for many applications, and the combined registration of hyperspectral and spatial data reveals a great deal of information about the measured object; a well-known example of further processing is extracting feature vectors from such a dataset. To increase the quality and quantity of the recoverable information, it is advantageous to have a dataset covering a wide spectral range. However, different spectral ranges generally require different imaging systems, and a major problem in using hyperspectral data from several hyperspectral imaging systems is combining them into one wide-range dataset, the so-called spectral cube. The aim of this work is to show, from a thorough analytical viewpoint, which merging methods are conceivable in principle and usable under different circumstances. Some work on the theory and design of a calibration model prototype is also included.
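
By way of illustration, the sketch below resamples two hypothetical spectral cubes with different band grids onto one common wavelength axis and averages them in the overlap region; the sensor ranges, 5 nm target grid and linear interpolation are assumptions for demonstration, not the strategies analysed in the paper.

```python
import numpy as np
from scipy.interpolate import interp1d

def resample_cube(cube, bands, target_bands):
    """Interpolate a (rows, cols, bands) cube onto a new wavelength grid."""
    f = interp1d(bands, cube, axis=-1, bounds_error=False, fill_value="extrapolate")
    return f(target_bands)

# Two hypothetical sensors covering adjacent ranges at different resolutions.
vnir_bands = np.linspace(400, 1000, 120)   # nm, fine sampling
swir_bands = np.linspace(950, 2500, 100)   # nm, coarser sampling
vnir = np.random.rand(64, 64, vnir_bands.size)
swir = np.random.rand(64, 64, swir_bands.size)

# Merge onto one 5 nm grid; average where the sensors overlap.
grid = np.arange(400, 2501, 5, dtype=float)
a = resample_cube(vnir, vnir_bands, grid)
b = resample_cube(swir, swir_bands, grid)
w_a = ((grid >= vnir_bands[0]) & (grid <= vnir_bands[-1])).astype(float)
w_b = ((grid >= swir_bands[0]) & (grid <= swir_bands[-1])).astype(float)
merged = (a * w_a[None, None] + b * w_b[None, None]) / np.maximum(w_a + w_b, 1)[None, None]
print(merged.shape)  # (64, 64, 421): one wide-range spectral cube
```
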
Citations: 3
A Single Hierarchical Network for Face, Action Unit and Emotion Detection
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615852
Shreyank Jyoti, Garima Sharma, Abhinav Dhall
Deep neural networks show strong performance on sets of specific tasks, and a system designed to handle several correlated tasks together can be feasible for 'in the wild' applications. This paper proposes a method for face localization, Action Unit (AU) detection and emotion detection. The three tasks are performed by a single hierarchical network that exploits the way neural networks learn; such a network can represent more relevant features than individual per-task networks. Because of their complex structure and depth, deploying neural networks in real-life applications remains challenging, so the paper focuses on finding an efficient trade-off between performance and task complexity. This is done by optimizing the network for the given tasks using separable convolutions, binarization and quantization. Four databases (AffectNet, EmotioNet, RAF-DB and WiderFace) are used to evaluate the proposed approach, each serving as a task-specific benchmark.
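
As a minimal sketch of the general pattern (a shared trunk of depthwise-separable convolutions feeding three task-specific heads), the PyTorch snippet below is illustrative only; the layer widths, head dimensions and pooling choices are assumptions, not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class SepConv(nn.Module):
    """Depthwise-separable convolution: depthwise 3x3 followed by pointwise 1x1."""
    def __init__(self, cin, cout):
        super().__init__()
        self.dw = nn.Conv2d(cin, cin, 3, padding=1, groups=cin)
        self.pw = nn.Conv2d(cin, cout, 1)
        self.act = nn.ReLU(inplace=True)
    def forward(self, x):
        return self.act(self.pw(self.dw(x)))

class MultiTaskNet(nn.Module):
    """Shared trunk with three task heads (face box, AUs, emotion); illustrative only."""
    def __init__(self, n_aus=12, n_emotions=7):
        super().__init__()
        self.trunk = nn.Sequential(SepConv(3, 32), nn.MaxPool2d(2),
                                   SepConv(32, 64), nn.MaxPool2d(2),
                                   SepConv(64, 128), nn.AdaptiveAvgPool2d(1),
                                   nn.Flatten())
        self.face_head = nn.Linear(128, 4)               # bounding-box regression
        self.au_head = nn.Linear(128, n_aus)             # multi-label AU logits
        self.emotion_head = nn.Linear(128, n_emotions)   # emotion class logits

    def forward(self, x):
        z = self.trunk(x)
        return self.face_head(z), self.au_head(z), self.emotion_head(z)

net = MultiTaskNet()
box, aus, emo = net(torch.randn(2, 3, 128, 128))
print(box.shape, aus.shape, emo.shape)  # [2, 4], [2, 12], [2, 7]
```
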
Citations: 3
Detecting Splicing and Copy-Move Attacks in Color Images
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615874
Mohammad Manzurul Islam, G. Karmakar, J. Kamruzzaman, Manzur Murshed, G. Kahandawa, N. Parvin
Image sensors generate countless digital images every day. Image forgeries such as splicing and copy-move are very common types of attack that are easy to execute with sophisticated photo-editing tools. As a result, digital forensics has attracted much attention for identifying such tampering in digital images. In this paper, a passive (blind) image-tampering identification method based on the Discrete Cosine Transform (DCT) and Local Binary Patterns (LBP) is proposed. First, the chroma components of an image are divided into fixed-size non-overlapping blocks and 2D block DCT is applied to identify the changes that forgery causes in the local frequency distribution of the image. Then a texture descriptor, LBP, is applied to the magnitude component of the 2D-DCT array to enhance the artifacts introduced by the tampering operation. The resulting LBP image is again divided into non-overlapping blocks. Finally, the sums of corresponding inter-cell values across all LBP blocks are computed and arranged as a feature vector. These features are fed into a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel to distinguish forged images from authentic ones. The proposed method has been tested extensively on three publicly available, well-known image splicing and copy-move detection benchmark datasets of color images. Results demonstrate the superiority of the proposed method over recently proposed state-of-the-art approaches in terms of widely accepted performance metrics such as accuracy and area under the ROC curve.
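
A rough sketch of a feature pipeline in this spirit is shown below: block 2D DCT of a chroma plane, LBP on the DCT magnitudes, per-block summation into a feature vector, and an RBF-kernel SVM. The block size, LBP parameters and toy data are assumptions, since the abstract does not fix them.

```python
import numpy as np
from scipy.fft import dctn
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def tamper_features(chroma, block=8):
    """Block 2D-DCT magnitudes -> LBP texture map -> per-block LBP sums."""
    h, w = (np.array(chroma.shape) // block) * block
    mag = np.zeros((h, w))
    for i in range(0, h, block):
        for j in range(0, w, block):
            mag[i:i+block, j:j+block] = np.abs(
                dctn(chroma[i:i+block, j:j+block], norm='ortho'))
    lbp = local_binary_pattern(mag, P=8, R=1, method='uniform')
    # Sum LBP codes over non-overlapping cells and flatten into one feature vector.
    cells = lbp.reshape(h // block, block, w // block, block).sum(axis=(1, 3))
    return cells.ravel()

# Toy training run on random "chroma" planes (labels: 0 = authentic, 1 = forged).
rng = np.random.default_rng(0)
X = np.stack([tamper_features(rng.random((64, 64))) for _ in range(20)])
y = rng.integers(0, 2, size=20)
clf = SVC(kernel='rbf', gamma='scale').fit(X, y)
print(clf.predict(X[:3]))
```
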
Citations: 7
Long-Term Recurrent Predictive Model for Intent Prediction of Pedestrians via Inverse Reinforcement Learning
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615854
Khaled Saleh, M. Hossny, S. Nahavandi
Recently, the problem of intent and trajectory prediction for pedestrians in urban traffic environments has received attention from the intelligent transportation research community. One of the main challenges that makes this problem hard is the uncertainty in pedestrians' actions in urban traffic environments, together with the difficulty of inferring their end goals. In this work, we propose a data-driven framework based on Inverse Reinforcement Learning (IRL) and a bidirectional recurrent neural network architecture (B-LSTM) for long-term prediction of pedestrians' trajectories. We evaluated our framework on real-life datasets for agent behavior modeling in traffic environments; it achieved an overall average displacement error of only 2.93 and 4.12 pixels over 2.0 s and 3.0 s ahead prediction horizons, respectively. Additionally, we compared our framework against baseline models based on sequence prediction alone, outperforming them by a margin of more than 5 pixels in average displacement error.
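
The recurrent part can be sketched as a bidirectional LSTM that encodes an observed track of (x, y) positions and regresses a fixed number of future positions; the hidden size, horizon and training loop are assumptions here, and the IRL component is not modelled.

```python
import torch
import torch.nn as nn

class BLSTMPredictor(nn.Module):
    """Observed (x, y) sequence -> predicted future (x, y) sequence."""
    def __init__(self, hidden=64, horizon=20):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(input_size=2, hidden_size=hidden,
                               batch_first=True, bidirectional=True)
        self.decoder = nn.Linear(2 * hidden, horizon * 2)

    def forward(self, obs):                  # obs: (batch, T_obs, 2)
        _, (h, _) = self.encoder(obs)        # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=-1)  # concat forward/backward states
        out = self.decoder(h)                # (batch, horizon * 2)
        return out.view(-1, self.horizon, 2)

model = BLSTMPredictor()
future = model(torch.randn(4, 30, 2))  # 30 observed steps per pedestrian
print(future.shape)                    # torch.Size([4, 20, 2])
```
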
Citations: 20
Combining Deep and Handcrafted Image Features for Vehicle Classification in Drone Imagery
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615853
Xuesong Le, Yufei Wang, Jun Jo
Using unmanned aerial vehicles (UAVs) as devices for traffic data collection offers many advantages for gathering traffic information. This paper presents an efficient method, based on deep learning combined with handcrafted features, for classifying vehicles in drone imagery. Experimental results show that, compared with classification algorithms based on a pre-trained CNN or handcrafted features alone, the proposed algorithm achieves higher vehicle-recognition accuracy at different UAV altitudes and view scopes, making it suitable for future traffic monitoring and control in metropolitan areas.
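
The generic fusion idea can be sketched by concatenating features from a frozen pre-trained CNN with handcrafted HOG features and training a simple classifier on the result; ResNet-18, the HOG parameters and logistic regression are stand-in assumptions rather than the components used in the paper.

```python
import numpy as np
import torch
from torchvision import models, transforms
from skimage.feature import hog
from sklearn.linear_model import LogisticRegression

# Pre-trained CNN as a frozen feature extractor (downloads weights on first use).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # drop the classifier, keep 512-d features
backbone.eval()
prep = transforms.Compose([transforms.ToTensor(),
                           transforms.Resize((224, 224), antialias=True)])

def fused_features(img):
    """img: HxWx3 uint8 vehicle crop -> concatenated deep + handcrafted vector."""
    with torch.no_grad():
        deep = backbone(prep(img).unsqueeze(0)).squeeze(0).numpy()  # 512-d
    hand = hog(img.mean(axis=2), pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return np.concatenate([deep, hand])

# Toy usage with random crops and labels (e.g. car / truck / bus).
rng = np.random.default_rng(0)
crops = rng.integers(0, 255, size=(12, 128, 128, 3), dtype=np.uint8)
labels = rng.integers(0, 3, size=12)
X = np.stack([fused_features(c) for c in crops])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.score(X, labels))
```
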
Citations: 3
Classifier-Free Extraction of Power Line Wires from Point Cloud Data
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615869
M. Awrangjeb, Yongsheng Gao, Guojun Lu
This paper proposes a classifier-free method for extracting power line wires from aerial point cloud data. It combines the advantages of both grid- and point-based processing of the input data. In addition to the non-ground point cloud, the input to the proposed method includes the pylon locations, which are extracted automatically by a previous method. The proposed method first counts the number of wires in the span between two successive pylons using two masks, one vertical and one horizontal. Initial wire segments are then obtained and refined iteratively. Finally, the initial segments are extended at both ends and each individual wire is modelled as a 3D polynomial curve. Experimental results show that object-based completeness and correctness are both 97%, while point-based completeness and correctness are 99% and 88%, respectively.
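
The final modelling step lends itself to a small sketch: parameterize each wire's points by distance along the span and fit low-order polynomials for the lateral and vertical coordinates. The degree-2 fit (a parabolic approximation of the sag) and the synthetic wire are assumptions for demonstration.

```python
import numpy as np

def fit_wire(points, deg=2):
    """Fit a 3D polynomial curve to wire points (N, 3) between two pylons.

    t is the distance along the span direction; y(t) and z(t) are polynomials.
    """
    span = points[-1, :2] - points[0, :2]
    direction = span / np.linalg.norm(span)
    t = (points[:, :2] - points[0, :2]) @ direction  # position along the span
    py = np.polyfit(t, points[:, 1], deg)            # lateral drift
    pz = np.polyfit(t, points[:, 2], deg)            # vertical sag
    return t, py, pz

# Synthetic sagging wire between pylons at x = 0 and x = 100 m.
rng = np.random.default_rng(1)
x = np.linspace(0, 100, 200)
z = 20 - 0.004 * x * (100 - x)  # catenary approximated by a parabola
pts = np.column_stack([x, 0.01 * x, z]) + rng.normal(0, 0.02, (200, 3))
t, py, pz = fit_wire(pts)
resid = pts[:, 2] - np.polyval(pz, t)
print(f"max vertical residual: {np.abs(resid).max():.3f} m")
```
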
Citations: 3
Spectral Super-resolution for RGB Images using Class-based BP Neural Networks
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615862
Xiaolin Han, Jing Yu, Jing-Hao Xue, Weidong Sun
Hyperspectral images have high spectral resolution and have been widely used in many applications, but achieving high spectral resolution in the imaging process comes at the expense of spatial resolution. This paper aims to construct a high-spatial-resolution hyperspectral (HHS) image from a high-spatial-resolution RGB image by proposing a novel class-based spectral super-resolution method. With the help of a set of RGB and HHS image pairs, the proposed method learns nonlinear spectral mappings between RGB and HHS image pairs using class-based back-propagation neural networks (BPNNs). In the training stage, unsupervised clustering is used to divide an RGB image into several classes according to spectral correlation, and the spectrum pairs from the classified RGB images and the corresponding HHS images are used to train the BPNNs, establishing a nonlinear spectral mapping for each class. In the spectral super-resolution stage, supervised classification assigns the pixels of a given RGB image to the classes determined during training, and the final HHS image is reconstructed from the classified RGB image using the trained BPNNs. Comparisons on three standard datasets, ICVL, CAVE and NUS, demonstrate that the proposed method achieves better spectral super-resolution quality than related state-of-the-art methods.
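
A toy sketch of the class-based mapping idea: cluster RGB pixels, train one small regressor per cluster to map RGB to a spectrum, then reconstruct new pixels with the regressor of their assigned class. KMeans, sklearn's MLPRegressor and a 31-band target are illustrative assumptions standing in for the paper's BPNN design.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_bands, n_classes = 31, 4

# Toy training pairs: RGB pixels and their "true" 31-band spectra.
rgb = rng.random((5000, 3))
basis = rng.random((3, n_bands))
spectra = rgb @ basis + 0.01 * rng.normal(size=(5000, n_bands))

# Training stage: cluster RGB pixels, then fit one per-class RGB->spectrum mapping.
km = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit(rgb)
nets = {}
for c in range(n_classes):
    m = km.labels_ == c
    nets[c] = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                           random_state=0).fit(rgb[m], spectra[m])

# Super-resolution stage: assign new pixels to classes, apply the matching net.
test = rng.random((100, 3))
cls = km.predict(test)
recon = np.empty((len(test), n_bands))
for c in range(n_classes):
    m = cls == c
    if m.any():
        recon[m] = nets[c].predict(test[m])
print(recon.shape)  # (100, 31) reconstructed spectra
```
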
Citations: 9
Systematic Analysis of Direct Sparse Odometry
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615807
F. Particke, A. Kalisz, Christian Hofmann, M. Hiller, Henrik Bey, J. Thielecke
In the fields of robotics and autonomous driving, the camera is becoming more and more important as a sensor, since cameras are cheap and robust against environmental influences. One challenging task is localizing the robot on an unknown map, which leads to the so-called Simultaneous Localization and Mapping (SLAM) problem. A plethora of Visual SLAM algorithms has been proposed in recent years, but they have rarely been evaluated with regard to their robustness. This contribution motivates the systematic analysis of Visual SLAM in simulation, using heterogeneous environments built in Blender. For this purpose, three different environments, ranging from very low-detail to highly detailed worlds, are used for evaluation. Direct Sparse Odometry (DSO) is evaluated as an exemplary Visual SLAM method. It is shown that DSO is very sensitive to rotations of the camera, and that if the scene does not provide sufficient depth cues, estimating the trajectory is not possible. The results are complemented by real-world experiments.
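
One standard ingredient of such an analysis is the absolute trajectory error (ATE): rigidly align the estimated trajectory to ground truth (here with a similarity transform via the Umeyama method, to absorb the monocular scale ambiguity) and report the RMSE of the residuals. Treating this as the paper's exact protocol would be an assumption; the sketch below is generic.

```python
import numpy as np

def ate_rmse(est, gt):
    """Align est (N, 3) to gt (N, 3) with a similarity transform, return RMSE."""
    mu_e, mu_g = est.mean(0), gt.mean(0)
    E, G = est - mu_e, gt - mu_g
    U, S, Vt = np.linalg.svd(G.T @ E / len(est))       # cross-covariance
    D = np.diag([1, 1, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt                                      # optimal rotation
    s = (S * np.diag(D)).sum() / (E ** 2).mean(0).sum()  # optimal scale
    aligned = s * est @ R.T + (mu_g - s * R @ mu_e)
    return np.sqrt(((aligned - gt) ** 2).sum(1).mean())

# Toy check: a rotated, scaled, shifted copy of a trajectory aligns perfectly.
t = np.linspace(0, 4 * np.pi, 200)
gt = np.column_stack([np.cos(t), np.sin(t), 0.1 * t])
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
est = 0.5 * gt @ Rz.T + np.array([1.0, -2.0, 0.5])
print(f"ATE RMSE: {ate_rmse(est, gt):.6f} m")  # ~0 after alignment
```
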
Citations: 1
Combination of Supervised Learning and Unsupervised Learning Based on Object Association for Land Cover Classification
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615871
Na Li, Arnaud Martin, R. Estival
Conventional supervised classification approaches have significant limitations for land cover classification from remote sensing data, because a large amount of high-quality labelled samples is difficult to guarantee. To overcome this limitation, combination with an unsupervised approach is considered a promising candidate. In this paper, we propose a novel framework that achieves this combination through object association based on Dempster-Shafer theory. Inspired by object association, the framework can label the unsupervised clusters according to the supervised classes even when the numbers of clusters and classes differ. The proposed framework has been tested on different combinations of commonly used supervised and unsupervised methods. Compared with the supervised methods alone, our framework improves the overall accuracy by up to approximately 8.2%. The experimental results show that the proposed framework achieves a twofold performance gain: better performance when training data are insufficient, and the possibility of applying it over a large area.
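
The core fusion operation in Dempster-Shafer theory is Dempster's rule of combination: masses of intersecting focal sets are multiplied and the result is renormalized by the total conflict. A minimal sketch on an assumed two-class land-cover frame:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule for mass functions keyed by frozensets of hypotheses."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Toy frame {water, forest}: one mass from a supervised classifier,
# one from an associated unsupervised cluster.
W, F = frozenset({"water"}), frozenset({"forest"})
WF = W | F  # ignorance: mass on the whole frame
m_supervised = {W: 0.6, F: 0.1, WF: 0.3}
m_cluster = {W: 0.5, WF: 0.5}
print(dempster_combine(m_supervised, m_cluster))
# {water: ~0.789, forest: ~0.053, {water, forest}: ~0.158}
```
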
Citations: 6
Bi-Modal Content Based Image Retrieval using Multi-class Cycle-GAN
Pub Date: 2018-12-01 DOI: 10.1109/DICTA.2018.8615838
Girraj Pahariya
Content-Based Image Retrieval (CBIR) systems retrieve relevant images from a database based on the content of the query. Most CBIR systems take a query image as input and retrieve similar images from a gallery based on global features (such as texture, shape and color) extracted from the image. An image database can be queried in several ways, including by text, image or sketch; however, traditional methodologies support only one of these domains at a time. Bridging the gap between different domains (sketch and image) is needed to enable a multi-modal CBIR system. In this work, we propose a novel bimodal query-based retrieval framework that can take inputs from both the sketch and image domains. The proposed framework aims to reduce the domain gap by learning a mapping function using Generative Adversarial Networks (GANs) and supervised deep domain adaptation techniques. Extensive experimentation and comparison with several baselines on two popular sketch datasets (Sketchy and TU-Berlin) show the effectiveness of the proposed framework.
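
Whatever model bridges the domain gap, the retrieval step itself reduces to nearest-neighbour ranking in a shared feature space. The sketch below shows that step with random vectors standing in for embeddings from a hypothetical cross-domain mapping; the 128-dimensional space and cosine similarity are assumptions.

```python
import numpy as np

def retrieve(query_vec, gallery, k=5):
    """Rank gallery embeddings by cosine similarity to a query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 128))            # embeddings of 1000 gallery images
query = gallery[42] + 0.1 * rng.normal(size=128)  # a noisy "sketch" of image 42
idx, sim = retrieve(query, gallery)
print(idx)                 # image 42 should rank first
print(np.round(sim, 3))
```
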
Citations: 1