
Latest publications from the 2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)

Analyzing relationship between maize height and spectral indices derived from remotely sensed multispectral imagery
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707373
Aleem Khaliq, M. Musci, M. Chiaberge
For maize crops, biophysical parameters such as canopy height and above-ground biomass are crucial agro-ecological indicators that describe crop growth, photosynthetic efficiency, and carbon stock. Remote sensing is a widely used approach and, in terms of area coverage, the most suitable source for monitoring vegetation conditions over large areas. In this study, Sentinel-2 multispectral imagery is used to calculate spectral vegetation indices over different maize growth periods using several visible bands together with the near-infrared spectrum. The relationship between maize biophysical variables (canopy height and above-ground biomass) collected during field measurements and the derived spectral vegetation indices is established and analyzed using simple linear regression and Pearson correlation, to explore the possibility of using satellite imagery to estimate crop biophysical parameters.
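As a concrete illustration of the analysis step, the sketch below computes a plot-level NDVI from red and near-infrared reflectances and relates it to measured canopy height via Pearson correlation and ordinary least squares. The band values and heights are invented for illustration only; the paper's actual indices and field data differ.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one plot."""
    return (nir - red) / (nir + red)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def linear_fit(xs, ys):
    """Ordinary least squares fit y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical plot-level reflectances and measured canopy heights (m)
nir_vals = [0.45, 0.52, 0.60, 0.66, 0.70]
red_vals = [0.20, 0.16, 0.12, 0.10, 0.08]
heights  = [0.8, 1.2, 1.7, 2.1, 2.4]

indices = [ndvi(n, r) for n, r in zip(nir_vals, red_vals)]
r = pearson_r(indices, heights)   # strength of index-height relationship
a, b = linear_fit(indices, heights)
```

With the invented values above, NDVI rises with canopy height, so the correlation is strongly positive and the regression slope is positive.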
Citations: 5
Using Semantic Relationships among Objects for Geospatial Land Use Classification
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707405
G. Rotich, Sathyanarayanan N. Aakur, R. Minetto, Maurício Pamplona Segundo, Sudeep Sarkar
Geospatial land recognition is often cast as a local-region-based classification problem. In this work, we show that prior knowledge, in the form of global semantic relationships among detected regions, allows us to leverage semantics and visual features to enhance land use classification in aerial imagery. To this end, we first estimate the top-k labels for each region using an ensemble of CNNs called Hydra. Twelve different models based on two state-of-the-art CNN architectures, ResNet and DenseNet, compose this ensemble. Then, we use Grenander's canonical pattern theory formalism coupled with the common-sense knowledge base ConceptNet to impose context constraints on the labels obtained by the deep learning algorithms. These constraints are captured in a multi-graph representation involving generators and bonds with a flexible topology, unlike an MRF or a Bayesian network, which have fixed structures. Minimizing the energy of this representation yields a graphical representation of the semantics in the given image. We show our results on the recent fMoW challenge dataset. It consists of 1,047,691 images spanning 62 different classes of land use, plus a false-detection category. The biggest improvement in performance from the use of semantics was for false detections. Other categories with significantly improved performance were: zoo, nuclear power plant, park, police station, and space facility. For the subset of fMoW images with multiple bounding boxes, the accuracy is 72.79% without semantics and 74.06% with semantics. Overall, without semantic context, the classification performance was 77.04%; with semantics, it reached 77.98%. Considering that less than 20% of the dataset contained more than one ROI for context, this is a significant improvement that shows the promise of the proposed approach.
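The ensemble's top-k labelling step can be sketched as simple per-class score averaging over the member models. This is a generic illustration, not the Hydra implementation, and the scores below are invented.

```python
def topk_labels(score_lists, k=3):
    """Average per-class scores from an ensemble and return the top-k class ids."""
    n_models = len(score_lists)
    n_classes = len(score_lists[0])
    avg = [sum(s[c] for s in score_lists) / n_models for c in range(n_classes)]
    ranked = sorted(range(n_classes), key=lambda c: avg[c], reverse=True)
    return ranked[:k]

# Hypothetical softmax outputs from three ensemble members over 5 classes
m1 = [0.10, 0.60, 0.05, 0.20, 0.05]
m2 = [0.05, 0.55, 0.10, 0.25, 0.05]
m3 = [0.15, 0.50, 0.05, 0.20, 0.10]

top2 = topk_labels([m1, m2, m3], k=2)  # candidate labels for the context step
```

In the paper, such per-region candidate labels then enter the pattern-theory energy minimization; the averaging above only stands in for the ensemble voting stage.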
Citations: 4
Autonomous Precision Landing for the Joint Tactical Aerial Resupply Vehicle
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707418
S. Recker, C. Gribble, M. Butkiewicz
We discuss the precision autonomous landing features of the Joint Tactical Aerial Resupply Vehicle (JTARV) platform. Autonomous navigation for aerial vehicles demands that computer vision algorithms provide not only relevant, actionable information, but that they do so in a timely manner; that is, the algorithms must operate in real time. This requirement for high performance dictates optimization at every level, which is the focus of our ongoing research and development efforts to add autonomous features to JTARV. Autonomous precision landing capabilities are enabled by high-performance deep learning and structure-from-motion techniques optimized for NVIDIA mobile GPUs. The system uses a single downward-facing camera to guide the vehicle to a coded photogrammetry target, ultimately enabling fully autonomous aerial resupply for troops on the ground. This paper details the system architecture and perception system design and evaluates performance on a scale vehicle. Results demonstrate that the system is capable of landing on stationary targets within relatively narrow spaces.
Citations: 4
End-to-End Text Classification via Image-based Embedding using Character-level Networks
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707407
Shunsuke Kitada, Ryunosuke Kotani, H. Iyatomi
For analysing and/or understanding languages that have no word boundaries, such as Japanese, Chinese, and Thai, it is desirable to perform appropriate word segmentation based on morphological analysis before word embedding, but this is inherently difficult in these languages. In recent years, various language models based on deep learning have made remarkable progress, and some methodologies utilizing character-level features have successfully avoided this difficult problem. However, when a model is fed character-level features of the above languages, it often overfits because of the large number of character types. In this paper, we propose CE-CLCNN, a character-level convolutional neural network with a character encoder, to tackle these problems. The proposed CE-CLCNN is an end-to-end learning model with an image-based character encoder; that is, the CE-CLCNN handles each character in the target document as an image. Through various experiments, we found and confirmed that our CE-CLCNN captures closely embedded features for visually and semantically similar characters and achieves state-of-the-art results on several open document classification tasks. In this paper we report the performance of our CE-CLCNN on the Wikipedia title estimation task and analyse its internal behaviour.
Citations: 4
A Machine Learning System for Classification of EMG Signals to Assist Exoskeleton Performance
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707426
Nagaswathi Amamcherla, A. Turlapaty, B. Gokaraju
A surface electromyographic (EMG) signal provides information on neuromuscular activity and can serve as an input to a myoelectric control system for applications such as orthotic exoskeletons. A key step in this process is extracting useful information from the EMG signals using pattern recognition tools. Our research focuses on identifying a set of relevant features for efficient EMG signal classification. Specifically, from the pre-processed myoelectric signals we extract autoregression coefficients and several time-domain features, including Hjorth features, integral absolute value, mean absolute value, root mean square, and cepstral features. A subset of selected features is then fed to a multiclass SVM classifier. Using a radial basis function kernel, a classification accuracy of 92.3% has been achieved.
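A minimal sketch of the time-domain feature extraction named above (IAV, MAV, RMS, and two Hjorth parameters) for a single pre-processed EMG window; the sample values are invented, and the autoregression and cepstral features are omitted.

```python
import math

def time_domain_features(signal):
    """IAV, MAV, RMS, and Hjorth activity/mobility for one EMG window."""
    n = len(signal)
    iav = sum(abs(x) for x in signal)                # integral absolute value
    mav = iav / n                                    # mean absolute value
    rms = math.sqrt(sum(x * x for x in signal) / n)  # root mean square
    mean = sum(signal) / n
    activity = sum((x - mean) ** 2 for x in signal) / n  # Hjorth activity (variance)
    diff = [signal[i + 1] - signal[i] for i in range(n - 1)]
    dmean = sum(diff) / len(diff)
    dvar = sum((d - dmean) ** 2 for d in diff) / len(diff)
    mobility = math.sqrt(dvar / activity) if activity else 0.0  # Hjorth mobility
    return {"iav": iav, "mav": mav, "rms": rms,
            "activity": activity, "mobility": mobility}

# Hypothetical pre-processed EMG window
window = [0.1, -0.3, 0.25, -0.2, 0.4, -0.35, 0.15, -0.1]
feats = time_domain_features(window)
```

In a full pipeline, such per-window feature dictionaries would be assembled into vectors and passed to the multiclass SVM.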
Citations: 6
Diagnosis of Multiple Cucumber Infections with Convolutional Neural Networks
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707385
Hiroki Tani, Ryunosuke Kotani, S. Kagiwada, H. Uga, H. Iyatomi
Recent machine learning approaches have shown promising results in the field of automated plant diagnosis. However, those systems were designed to diagnose single infections and thus do not account for multiple infections. In this paper, we created an original on-site cucumber leaf dataset including multiple infections in order to build a practical plant diagnosis system. The dataset contains a total of 48,311 cucumber leaf images (38,821 leaves infected with any of 11 kinds of diseases, 1,814 leaves infected with multiple diseases, and 7,676 healthy leaves). We developed a convolutional neural network (CNN) classifier with a sigmoid function and a tunable threshold on each node of the last output layer. Our model attained an average classification accuracy of 95.5% on the entire dataset. On the multiply infected cases alone, the accuracy was 85.9%, and the model correctly identified at least one disease in 1,808 of the 1,814 cases (99.7%).
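The per-node sigmoid-with-threshold output described above can be sketched as follows. The logits and thresholds are invented; in the paper the thresholds are tuned rather than fixed by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_multilabel(logits, thresholds):
    """Per-class sigmoid with an independently tunable threshold on each
    output node; a leaf may thus be assigned several diseases, or none."""
    return [int(sigmoid(z) >= t) for z, t in zip(logits, thresholds)]

# Hypothetical final-layer outputs for 4 diseases and per-node thresholds
logits = [2.2, -1.0, 0.3, -3.0]
thresholds = [0.5, 0.5, 0.4, 0.5]

pred = predict_multilabel(logits, thresholds)  # 1 = disease present
```

Unlike a softmax layer, the sigmoid nodes are independent, which is what permits multiple simultaneous positive labels.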
Citations: 12
Fusion based Heterogeneous Convolutional Neural Networks Architecture
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707371
David Kornish, Soundararajan Ezekiel, Maria Scalzo-Cornacchia
In recent years, Deep Convolutional Neural Networks (DCNNs) have gained much attention and won many competitions in machine learning, object detection, image classification, and pattern recognition. Breakthroughs in the development of graphics processing units have made it possible to train DCNNs quickly for state-of-the-art tasks such as image classification, speech recognition, and many others. However, to solve complex problems, these multilayered convolutional neural networks become increasingly large, complex, and abstract. We propose methods to improve the performance of neural networks while reducing their dimensionality, enabling a better understanding of the learning process. To leverage the extensive training, as well as the strengths, of several pretrained models, we explored new approaches for combining features from the fully connected layers of models with heterogeneous architectures. The proposed approach combines features extracted from the penultimate fully connected layers of three different DCNNs. We merge the features of all three DCNNs and apply principal component analysis or linear discriminant analysis. Our approach aims to reduce the dimensionality of the feature vector and find the smallest dimension that can maintain classifier performance. For this task we use a linear Support Vector Machine as the classifier. We also investigate whether it is advantageous to fuse only the penultimate fully connected layers, or to perform fusion based on other fully connected layers using multiple homogeneous or heterogeneous networks. The results show that the fusion method outperformed the individual networks in terms of accuracy and computational time across all of our trial sizes. Overall, our fusion methods are faster and more accurate than individual networks in both training and testing. Finally, we compared heterogeneous with homogeneous fusion methods, and the results show that heterogeneous methods outperform homogeneous ones.
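The fusion idea can be sketched as concatenation of penultimate-layer features followed by dimensionality reduction. Here a simple variance ranking stands in for the paper's PCA/LDA step, and all feature values are invented.

```python
def fuse_features(*feature_vectors):
    """Concatenate penultimate-layer features from several networks."""
    fused = []
    for v in feature_vectors:
        fused.extend(v)
    return fused

def top_variance_dims(samples, k):
    """Rank fused dimensions by variance across samples and keep the top k
    (a lightweight stand-in for the PCA/LDA reduction in the paper)."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    var = [sum((s[j] - means[j]) ** 2 for s in samples) / n for j in range(d)]
    return sorted(range(d), key=lambda j: var[j], reverse=True)[:k]

# Hypothetical penultimate features from three DCNNs for two images
img1 = fuse_features([0.2, 0.9], [0.5], [0.1, 0.1])
img2 = fuse_features([0.8, 0.9], [0.4], [0.7, 0.1])
keep = top_variance_dims([img1, img2], k=2)  # dimensions worth keeping
```

The reduced vectors would then be fed to a linear SVM, as in the paper's pipeline.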
Citations: 4
BioSense: Real-Time Object Tracking for Animal Movement and Behavior Research
Pub Date: 2018-10-01 DOI: 10.1109/AIPR.2018.8707411
Jon Patman, Sabrina C. J. Michael, Marvin M. F. Lutnesky, K. Palaniappan
Video object tracking has been used with great success in numerous applications ranging from autonomous vehicle navigation to medical image analysis. A broad and emerging domain for exploration is automatic video-based animal behavior understanding. Interesting yet difficult-to-test hypotheses can be evaluated by high-throughput processing of animal movements and interactions collected from both laboratory and field experiments. In this paper we describe BioSense, a new standalone software platform and user interface that provides researchers with an open-source framework for collecting and quantitatively analyzing video data characterizing animal movement and behavior (e.g. spatial location, velocity, region preference). BioSense is capable of tracking multiple objects in real time, using various object detection methods suitable for a range of environments and animals. Real-time operation also enables a tactical approach to object tracking, allowing users to manipulate the software's controls while seeing immediate visual feedback. We evaluate the capabilities of BioSense in a series of video tracking benchmarks representative of the challenges present in animal behavior research.
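A minimal sketch of one common ingredient of real-time multi-object tracking: greedy nearest-centroid association of per-frame detections to existing tracks. This is a generic illustration, not BioSense's actual algorithm, and the coordinates are invented.

```python
import math

def assign_detections(tracks, detections, max_dist=50.0):
    """Greedily match each track to its nearest unclaimed detection
    (within max_dist pixels); unmatched detections start new tracks."""
    assignments = {}
    free = set(range(len(detections)))
    for tid, (tx, ty) in tracks.items():
        best, best_d = None, max_dist
        for j in free:
            d = math.hypot(detections[j][0] - tx, detections[j][1] - ty)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            assignments[tid] = best
            free.discard(best)
    next_id = max(tracks, default=-1) + 1
    for j in sorted(free):          # spawn tracks for leftover detections
        assignments[next_id] = j
        next_id += 1
    return assignments

tracks = {0: (10.0, 10.0), 1: (100.0, 40.0)}         # last known centroids
detections = [(102.0, 43.0), (12.0, 9.0), (200.0, 200.0)]
result = assign_detections(tracks, detections)
```

Production trackers typically replace the greedy loop with optimal assignment (e.g. the Hungarian algorithm) plus motion prediction, but the data association idea is the same.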
Cited by: 11
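The detector stage such a tracker needs can be sketched, under simplifying assumptions, as plain frame differencing on NumPy arrays: pixels that brighten between consecutive frames are taken as the moving object, and their intensity-independent centroid is reported. The function name, threshold value, and synthetic frames below are illustrative, not BioSense's actual API or detection method.

```python
import numpy as np

def detect_moving_object(prev_frame, frame, threshold=30):
    """Return the (row, col) centroid of pixels that brightened between
    two frames, or None when nothing moved. A stand-in for the detector
    stage of a real-time animal tracker."""
    # Signed difference so only newly bright pixels (the object's new
    # position) are kept; the vacated position darkens and is ignored.
    diff = frame.astype(np.int32) - prev_frame.astype(np.int32)
    mask = diff > threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()

# Tiny synthetic test: a bright 3x3 "animal" moves from (10, 10) to (20, 30).
prev_frame = np.zeros((64, 64), dtype=np.uint8)
frame = np.zeros((64, 64), dtype=np.uint8)
prev_frame[9:12, 9:12] = 255
frame[19:22, 29:32] = 255
centroid = detect_moving_object(prev_frame, frame)
```

A production tracker would add per-object identity association across frames; this sketch only shows the per-frame detection step.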
Integration of Deep Learning and Graph Theory for Analyzing Histopathology Whole-slide Images
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707424
Hyun Jung, Christian Suloway, Tianyi Miao, E. Edmondson, D. Morcock, C. Deleage, Yanling Liu, Jack R. Collins, C. Lisle
Characterization of collagen deposition in immunostained images is relevant to various pathological conditions, particularly human immunodeficiency virus (HIV) infection. Accurately segmenting these collagens and extracting representative features of the underlying disease are important steps toward quantitative diagnosis. While a first-order statistic derived from the segmented collagens can be useful for representing pathological evolution at different time points, it fails to capture morphological changes and spatial arrangements. In this work, we demonstrate a complete pipeline for extracting key histopathology features representing underlying disease progression from histopathology whole-slide images (WSIs) via the integration of deep learning and graph theory. A convolutional neural network is trained and utilized for histopathological WSI segmentation. Parallel processing is applied to convert 100K–150K segmented collagen fibrils into a single collective attributed relational graph, and graph theory is applied to extract topological and relational information from the collagenous framework. Results are in good agreement with the expected pathogenicity induced by collagen deposition, highlighting the approach's potential in clinical applications for analyzing various meshwork structures in whole-slide histology images.
{"title":"Integration of Deep Learning and Graph Theory for Analyzing Histopathology Whole-slide Images","authors":"Hyun Jung, Christian Suloway, Tianyi Miao, E. Edmondson, D. Morcock, C. Deleage, Yanling Liu, Jack R. Collins, C. Lisle","doi":"10.1109/AIPR.2018.8707424","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707424","url":null,"abstract":"Characterization of collagen deposition in immunostained images is relevant to various pathological conditions, particularly in human immunodeficiency virus (HIV) infection. Accurate segmentation of these collagens and extracting representative features of underlying diseases are important steps to achieve quantitative diagnosis. While a first order statistic derived from the segmented collagens can be useful in representing pathological evolutions at different timepoints, it fails to capture morphological changes and spatial arrangements. In this work, we demonstrate a complete pipeline for extracting key histopathology features representing underlying disease progression from histopathology whole-slide images (WSIs) via integration of deep learning and graph theory. A convolutional neural network is trained and utilized for histopathological WSI segmentation. Parallel processing is applied to convert 100K ~ 150K segmented collagen fibrils into a single collective attributed relational graph, and graph theory is applied to extract topological and relational information from the collagenous framework. 
Results are in good agreement with the expected pathogenicity induced by collagen deposition, highlighting potentials in clinical applications for analyzing various meshwork-structures in whole-slide histology images.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130385483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
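The graph-theoretic step above can be illustrated with a toy relational graph: segmented fibrils are reduced to centroids, edges connect pairs closer than a radius, and simple topological features (mean degree, number of connected components) are read off. The function names, the fixed-radius edge rule, and the toy centroids are illustrative assumptions, not the paper's implementation, which operates on 100K+ attributed nodes in parallel.

```python
from itertools import combinations
from math import dist

def build_relational_graph(centroids, radius):
    """Connect every pair of fibril centroids closer than `radius`;
    returns an adjacency list (node index -> set of neighbour indices)."""
    adj = {i: set() for i in range(len(centroids))}
    for i, j in combinations(range(len(centroids)), 2):
        if dist(centroids[i], centroids[j]) <= radius:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def connected_components(adj):
    """Count connected components via depth-first search -- one simple
    topological feature of the collagenous framework."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adj[node] - seen)
    return components

# Two tight clusters of "fibrils" far apart -> two components.
centroids = [(0, 0), (1, 0), (0, 1), (50, 50), (51, 50)]
adj = build_relational_graph(centroids, radius=2.0)
mean_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
```

Features like these (degree statistics, component counts, clustering) are the kind of relational information a first-order intensity statistic cannot express.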
An Improved Star Detection Algorithm Using a Combination of Statistical and Morphological Image Processing Techniques
Pub Date : 2018-10-01 DOI: 10.1109/AIPR.2018.8707438
AL Samed, I. Karagoz, Ali Dogan
A star detection algorithm determines the position and magnitude of stars in an observed space scene. In this study, a robust star detection algorithm is presented that filters the noise out of astronomical images and accurately estimates the centroids of stars while preserving their native circular shapes. The proposed algorithm uses a combination of global and local filters as well as morphological operations. The global filter eliminates the blurring of images caused by system-induced noise with Point Spread Function (PSF) characteristics, while the local filter removes noise with a Gaussian distribution. Because the local filter should achieve optimal noise reduction without damaging the structure of the stars, a PCA (Principal Component Analysis)-based denoising filter is preferred. Although the PCA method is good at preserving the mass integrity of stars, it may also distort their shapes; morphological operations help restore this deformation. To verify the proposed algorithm, Gaussian noise with different variance values was added to astronomical star images to simulate the varied conditions of near space. The Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) are used as performance metrics to quantify the accuracy of the filtering process. Furthermore, to demonstrate the overall accuracy of the method against noise, the Mean Error of Centroid Estimation (MECE) was computed by means of Monte Carlo analysis. The performance of the algorithm has also been compared with similar algorithms, and the results show that it outperforms them.
{"title":"An Improved Star Detection Algorithm Using a Combination of Statistical and Morphological Image Processing Techniques","authors":"AL Samed, I. Karagoz, Ali Dogan","doi":"10.1109/AIPR.2018.8707438","DOIUrl":"https://doi.org/10.1109/AIPR.2018.8707438","url":null,"abstract":"A star detection algorithm determines the position and magnitude of stars on an observed space scene. In this study, a robust star detection algorithm is presented that filters the noise out in astronomical images and accurately estimates the centroid of stars in a way that preserving their native circular shapes. The proposed algorithm suggests the usage of different filters including global and local filters as well as morphological operations. The global filter has been utilized to eliminate the blurring effect of the images due to system-induced noises with Point Spread Function (PSF) characteristics while the local filter aims to remove the noises with Gaussian distribution. The local filter should perform optimum noise reduction as well as not damaging the structure of the stars, therefore, a PCA (Principal Component Analysis) based denoising filter have been preferred to use. Although the PCA method is even good at preserving the mass integrity of stars, it may also have disruptive effects on the shape of them. Morphological operations help to restore this deformation. In order to verify the proposed algorithm, different types of noises having the Gaussian characteristics with different variance values have been inserted to astronomical star images to simulate the varied conditions of near space. Structural Similarity Index (SSIM) and Peak Signal to Noise Ratio (PSNR) parameters have been used as a performance metrics to show the accuracy of the filtering process. Furthermore, to demonstrate the overall accuracy of this method against to noise, the Mean Error of Centroid Estimation (MECE) has been achieved by means of the Monte Carlo analysis. 
Also, the performance of this algorithm has been compared with similar algorithms and the results show that this algorithm outperforms others.","PeriodicalId":230582,"journal":{"name":"2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116163541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
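The MECE evaluation described above can be sketched as a small Monte Carlo experiment: render a Gaussian-PSF star at a known sub-pixel position, add Gaussian noise many times, estimate the centroid from each noisy frame, and average the centroid errors. The function names, the simple global threshold standing in for the paper's filtering stages, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def weighted_centroid(image):
    """Intensity-weighted centroid of a star image (sub-pixel estimate)."""
    total = image.sum()
    rows, cols = np.indices(image.shape)
    return (rows * image).sum() / total, (cols * image).sum() / total

def mece(true_pos, size=32, amplitude=100.0, sigma_psf=1.5,
         noise_sigma=2.0, trials=200, seed=0):
    """Mean Error of Centroid Estimation over Monte Carlo noise draws."""
    rng = np.random.default_rng(seed)
    rows, cols = np.indices((size, size))
    r0, c0 = true_pos
    # Gaussian PSF star at a known sub-pixel position.
    star = amplitude * np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2)
                              / (2.0 * sigma_psf ** 2))
    errors = []
    for _ in range(trials):
        noisy = star + rng.normal(0.0, noise_sigma, star.shape)
        # Crude 3-sigma global threshold stands in for the paper's
        # global/local filtering before centroiding.
        noisy[noisy < 3.0 * noise_sigma] = 0.0
        est_r, est_c = weighted_centroid(noisy)
        errors.append(np.hypot(est_r - r0, est_c - c0))
    return float(np.mean(errors))

error = mece(true_pos=(15.3, 16.7))
```

With filtering, the mean centroid error stays well under a pixel; removing the threshold line shows how background noise biases the centroid toward the frame center, which is exactly the effect the paper's denoising stage targets.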
2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)