
Latest Publications: 中国图象图形学报 (Journal of Image and Graphics)

Systematic Configuration for Hyperparameters Optimization in Transferring of CNN Model to Disaster Events Classification from UAV Images
Q3 Computer Science Pub Date : 2023-09-01 DOI: 10.18178/joig.11.3.263-270
Supaporn Bunrit, Nittaya Kerdprasop, Kittisak Kerdprasop
Deep learning and computer vision approaches, combined with the evolution of Unmanned Aerial Vehicle (UAV) and drone technologies, have significantly motivated advances in disaster management applications. This research studied a classification method for identifying disaster events from UAV images that is suitable for disaster monitoring. Convolutional Neural Network (CNN) variants of GoogLeNet pretrained on the ImageNet and Places365 datasets were explored to find the one most appropriate for fine-tuning to classify disaster events. To obtain optimal performance, a systematic configuration for searching the hyperparameters when fine-tuning the CNN model was proposed. The top three hyperparameters that affect performance, namely the initial learning rate, the number of epochs, and the minibatch size, were systematically set and tuned for each configuration. The proposed approach consists of five stages, during which three types of trials were used to monitor different sets of hyperparameters. The experimental results revealed that applying the proposed approach can increase model performance by up to 5%, with an optimal accuracy of 98.77%. For UAV/drone applications, where a small onboard model is preferred, GoogLeNet, which is quite small in model size and has a good structure for further fine-tuning, is suitable for deployment.
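The abstract does not spell out the five stages, but the staged idea of sweeping the three hyperparameters and then refining can be sketched as follows; `train_and_eval` is a hypothetical stand-in for fine-tuning the CNN, with a toy scoring rule so the sketch runs:

```python
from itertools import product

def train_and_eval(lr, epochs, batch_size):
    """Hypothetical stand-in for fine-tuning the pretrained CNN and
    returning validation accuracy; the scoring rule is a toy so the
    sketch runs without any training."""
    return 0.95 - abs(lr - 1e-4) * 100 - abs(batch_size - 32) / 1000 + epochs * 1e-4

def staged_search(lrs, epoch_counts, batch_sizes):
    """Stage 1: coarse sweep over all three hyperparameters.
    Stage 2: refine the learning rate around the best coarse value."""
    best = max(product(lrs, epoch_counts, batch_sizes),
               key=lambda cfg: train_and_eval(*cfg))
    lr, epochs, bs = best
    best_lr = max((lr * f for f in (0.5, 1.0, 2.0)),
                  key=lambda l: train_and_eval(l, epochs, bs))
    return best_lr, epochs, bs

cfg = staged_search([1e-3, 1e-4, 1e-5], [10, 20], [16, 32, 64])
```

Monitoring each stage with a different subset of hyperparameters, as the paper does, would simply swap which arguments are held fixed in each `max` step.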
Citations: 0
An Efficient Backbone for Early Forest Fire Detection Based on Convolutional Neural Networks
Q3 Computer Science Pub Date : 2023-09-01 DOI: 10.18178/joig.11.3.227-232
D. Mahanta, D. Hazarika, V. K. Nath
Forest fires cause disastrous damage to both human life and ecosystems. It is therefore essential to detect forest fires at an early stage to reduce the damage. Convolutional Neural Networks (CNNs) are widely used for forest fire detection. This paper proposes a new backbone network for a CNN-based forest fire detection model. The proposed backbone detects plumes of smoke well by decomposing the conventional convolution into depth-wise and coordinate convolutions, which better extract information from objects that spread along the vertical dimension. Experimental results show that the proposed backbone network outperforms other popular backbones, achieving a detection accuracy of up to 52.6 AP.
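The efficiency argument behind such a factorization can be seen from a parameter count; the exact decomposition in the paper may differ (the 1x1 pointwise projection below is an assumption), but the sketch shows why depth-wise plus 1-D "coordinate" kernels shrink a backbone:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def decomposed_params(c_in, c_out, k):
    """Weights after factorising into a depth-wise k x k convolution,
    k x 1 / 1 x k 'coordinate' convolutions, and a 1x1 pointwise
    projection (the pointwise step is an illustrative assumption)."""
    depthwise = c_in * k * k
    coordinate = c_in * k + c_in * k   # vertical + horizontal 1-D kernels
    pointwise = c_in * c_out
    return depthwise + coordinate + pointwise
```

For a 64-channel 3x3 layer this cuts the weight count from 36,864 to 5,056, which is what makes the backbone cheap enough for early-detection deployments.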
Citations: 0
An Enhanced Security in Medical Image Encryption Based on Multi-level Chaotic DNA Diffusion
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.153-160
Mousumi Gupta, Snehashish Bhattacharjee, Biswajoy Chatterjee
A novel medical image encryption technique is proposed based on the features of DNA encoding and decoding combined with a logistic map approach. The approach is proven for the encryption of highly sensitive medical images, with 100% integrity or negligible data loss. Testing is done on both high- and low-resolution images. The proposed encryption technique consists of two levels of diffusion using the actual structure of DNA. In the first level of the diffusion process, DNA encoding and decoding operations are used to generate a DNA sequence for each pixel. The originality of the work lies in using a long DNA structure, stored in a text file held at both the sender's and receiver's ends, to improve the performance of the proposed method. In this initial level of diffusion, DNA sequences are generated for each pixel, and for each DNA sequence, index values are obtained by employing a search operation on the DNA structure. These index values are further modified and used in the next diffusion process. In the second level of diffusion, a highly chaotic logistic map is iterated to generate sequences, from which chaotic values are extracted to form the cipher images. Correlation coefficient analysis, histogram analysis, entropy analysis, NPCR, and UACI all exhibit significant results. Therefore, the proposed technique can play an important role in securing low-resolution medical images as well as other highly sensitive visible images.
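The two primitives the scheme builds on can be sketched in a few lines: the logistic map iteration used in the second diffusion level, and a per-pixel DNA encoding. The encoding rule below is one common 2-bit-to-base mapping; the paper's specific rule and parameters may differ:

```python
def logistic_sequence(x0, r, n):
    """Iterate the logistic map x_{k+1} = r * x_k * (1 - x_k); with
    r close to 4 the sequence is highly chaotic."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        xs.append(x)
    return xs

# One common DNA encoding rule (the paper's specific rule may differ).
DNA = {'00': 'A', '01': 'C', '10': 'G', '11': 'T'}

def dna_encode(pixel):
    """Encode an 8-bit pixel value as four DNA bases."""
    bits = format(pixel, '08b')
    return ''.join(DNA[bits[i:i + 2]] for i in range(0, 8, 2))
```

For example, `dna_encode(27)` (binary `00011011`) yields the sequence `ACGT`, and the chaotic values from `logistic_sequence` stay inside (0, 1), ready to be quantised into key material.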
Citations: 0
Mobile Dermatoscopy: Class Imbalance Management Based on Blurring Augmentation, Iterative Refining and Cost-Weighted Recall Loss
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.161-169
Nauman Ullah Gilal, Samah Ahmed Mustapha Ahmed, J. Schneider, Mowafa J Househ, Marco Agus
We present an end-to-end framework for real-time melanoma detection on mole images acquired with mobile devices equipped with off-the-shelf magnifying lenses. We trained our models with transfer learning through EfficientNet convolutional neural networks, using the public-domain International Skin Imaging Collaboration ISIC-2019 and ISIC-2020 datasets. To reduce the class imbalance issue, we integrated the standard training pipeline with schemes for effective data balance using oversampling and iterative cleaning through loss ranking. We also introduce a blurring scheme able to emulate the aberrations produced by commonly available magnifying lenses, and a novel loss function incorporating the difference in cost between false negative (missed melanoma) and false positive (misclassified benign lesion) predictions. Through preliminary experiments, we show that our framework can create models for real-time mobile inference with a controlled tradeoff between the false positive and false negative rates. The performances obtained on the ISIC-2020 dataset are the following: accuracy 96.9%, balanced accuracy 98%, ROC AUC 0.98, benign recall 97.7%, malignant recall 97.2%.
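The cost-weighting idea can be sketched as a class-weighted binary cross-entropy in which a missed melanoma is penalised more heavily than a false alarm; the 5:1 ratio and the function itself are illustrative assumptions, not the paper's actual recall loss:

```python
import numpy as np

def cost_weighted_bce(y_true, p_pred, fn_cost=5.0, fp_cost=1.0):
    """Binary cross-entropy in which missing a melanoma (false negative)
    costs fn_cost times more than a false alarm. The 5:1 ratio is an
    illustrative assumption, not the paper's setting."""
    eps = 1e-7
    p = np.clip(p_pred, eps, 1 - eps)
    loss = -(fn_cost * y_true * np.log(p)
             + fp_cost * (1 - y_true) * np.log(1 - p))
    return loss.mean()

# A confident miss on a melanoma is penalised far more than an equally
# confident false alarm on a benign lesion.
miss = cost_weighted_bce(np.array([1.0]), np.array([0.1]))
alarm = cost_weighted_bce(np.array([0.0]), np.array([0.9]))
```

Raising `fn_cost` shifts the decision boundary toward higher malignant recall, which is how the false-positive/false-negative tradeoff mentioned in the abstract is controlled.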
Citations: 0
The Performance Analysis of Facial Expression Recognition System Using Local Regions and Features
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.104-114
Yining Yang, Vuksanovic Branislav, Hongjie Ma
Different parts of the face contribute to overall facial expressions, such as anger, happiness, and sadness, in distinct ways. This paper investigates how much different parts of the human face contribute to the accuracy of Facial Expression Recognition (FER). In the context of machine learning, FER refers to the problem of training a computer vision system to automatically detect the facial expression in a presented facial image. It is a difficult image classification problem that is not yet fully solved and has received significant attention in recent years, mainly due to the growing number of possible applications in daily life. To establish the extent to which different face parts contribute to overall facial expression, various sections were extracted from a set of facial images and then used as inputs to three different FER systems. The recognition rates for each facial section confirm that different regions of the face vary in importance for the accuracy achieved by an associated FER system.
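The region-extraction step can be sketched as fixed crops on an aligned face image; the box coordinates below are illustrative placeholders, not the regions used in the study:

```python
import numpy as np

# Illustrative region boxes (top, bottom, left, right) for a 96x96
# aligned face image; the regions used in the study may differ.
REGIONS = {
    'eyes':  (20, 45, 10, 86),
    'nose':  (40, 70, 30, 66),
    'mouth': (65, 90, 25, 71),
}

def extract_regions(face):
    """Crop each named facial section; every crop can then be fed to a
    separate FER classifier to measure its contribution."""
    return {name: face[t:b, l:r] for name, (t, b, l, r) in REGIONS.items()}

face = np.zeros((96, 96), dtype=np.uint8)
crops = extract_regions(face)
```

Comparing per-region recognition rates then amounts to training and evaluating one classifier per entry in `crops`.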
Citations: 0
ResMLP_GGR: Residual Multilayer Perceptrons-Based Genotype-Guided Recurrence Prediction of Non-small Cell Lung Cancer
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.185-194
Yang Ai, Yinhao Li, Yen-Wei Chen, Panyanat Aonpong, Xianhua Han
Non-small Cell Lung Cancer (NSCLC) is one of the malignant tumors with the highest morbidity and mortality. The postoperative recurrence rate in patients with NSCLC is high, which directly endangers the lives of patients. In recent years, many studies have used Computed Tomography (CT) images to predict NSCLC recurrence. Although this approach is inexpensive, it has low prediction accuracy. Gene expression data can achieve high accuracy; however, gene acquisition is expensive and invasive, and cannot meet the recurrence prediction requirements of all patients. In this study, a low-cost, high-accuracy residual multilayer perceptron-based genotype-guided recurrence (ResMLP_GGR) prediction method is proposed that uses a gene estimation model to guide recurrence prediction. First, a gene estimation model is proposed that constructs a mapping function from mixed features (handcrafted and deep features) to gene data, estimating the genetic information of tumor heterogeneity. Then, representations related to recurrence are learned from the gene estimates produced by a regression model to realize NSCLC recurrence prediction. In the testing phase, NSCLC recurrence prediction can be achieved with only CT images. The experimental results show that the proposed method has few parameters, strong generalization ability, and is suitable for small datasets. Compared with state-of-the-art methods, the proposed method significantly improves recurrence prediction accuracy, by 3.39%, while using only 1% of the parameters.
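The two-stage idea (image features → estimated genes → recurrence score) can be sketched with a plain least-squares map standing in for the ResMLP; the dimensions and synthetic data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 (sketch): fit a mapping from image features to gene expression.
# A least-squares linear map stands in for the ResMLP here; the
# feature/gene dimensions are illustrative.
X = rng.normal(size=(100, 16))   # mixed handcrafted + deep image features
W_true = rng.normal(size=(16, 8))
G = X @ W_true                   # synthetic "gene expression" targets
W_est, *_ = np.linalg.lstsq(X, G, rcond=None)

# Stage 2 (sketch): predict a recurrence score from the *estimated*
# genes, so that at test time only CT-derived features are needed.
v = rng.normal(size=8)           # stand-in recurrence predictor weights
scores = (X @ W_est) @ v
```

The key design point survives the simplification: the gene data are needed only to fit stage 1, so inference requires no invasive gene acquisition.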
Citations: 0
Local Bit-Plane Domain 3D Oriented Arbitrary and Circular Shaped Scanning Patterns for Bio-Medical Image Retrieval
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.212-226
D. Mahanta, D. Hazarika, V. K. Nath
A new feature descriptor, the local bit-plane domain 3D oriented arbitrary and circular shaped scanning pattern (LB-3D-OACSP), is proposed for biomedical image retrieval in this study. Unlike circular, zigzag, and other scanning structures, the LB-3D-OACSP descriptor calculates the association between a reference pixel and its surrounding pixels in the bit-plane domain of a 3D plane, using multi-directional 3D arbitrary and 3D circular shaped scanning patterns. In contrast to other scanning structures, the multi-directional 3D arbitrary shaped patterns provide more continual angular dissimilarity among the sampling positions, with the aim of capturing more frequent changes in local textures. A total of sixteen discriminative 3D arbitrary and 3D circular shaped patterns, oriented in various directions, are applied on a 3D plane constructed from the respective bit-planes of three multi-scale images. This ensures maximum extraction of inter-scale geometrical information across the scales, effectively capturing not only uniform but also non-uniform textures. The three multi-scale images are generated by processing the input image with Gaussian filter banks. The LB-3D-OACSP descriptor is able to capture most of the very fine to coarse image textures through the encoding of bit-planes. The performance of LB-3D-OACSP is tested on three popular biomedical image databases in terms of both % average retrieval precision (ARP) and % average retrieval recall (ARR). The experiments demonstrate an encouraging enhancement in %ARP and %ARR compared to many existing state-of-the-art descriptors.
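The bit-plane decomposition the descriptor operates on is a standard step and can be shown in a few lines; the scanning patterns themselves are specific to the paper and are not reproduced here:

```python
import numpy as np

def bit_planes(img):
    """Split an 8-bit image into its eight binary bit-planes
    (plane 0 holds the least significant bits)."""
    return [(img >> k) & 1 for k in range(8)]

img = np.array([[170, 5], [255, 0]], dtype=np.uint8)
planes = bit_planes(img)
```

Summing the planes back with their bit weights reconstructs the original image exactly, so no texture information is lost by working per plane.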
Citations: 0
Classification Model Based on U-Net for Crack Detection from Asphalt Pavement Images
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.121-126
Y. Fujita, Taisei Tanaka, Tomoki Hori, Y. Hamamoto
The purpose of our study is to accurately detect cracks in asphalt pavement surface images, which include unexpected objects, non-uniform illumination, and surface irregularities. We propose a method to construct a classification Convolutional Neural Network (CNN) model based on a pre-trained U-Net, a well-known semantic segmentation model. First, we train the U-Net with a limited amount of asphalt pavement surface data obtained by a Mobile Mapping System (MMS). Then, we use the encoder of the trained U-Net as a feature extractor to construct a classification model, which we train by fine-tuning. We describe comparative evaluations with VGG11, ResNet18, and GoogLeNet, well-known models constructed by transfer learning on ImageNet, a large-scale dataset of natural images. Experimental results show that our model has high classification performance compared to the other models constructed by transfer learning on ImageNet. Our method is effective for constructing a convolutional neural network model from a limited training dataset.
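The reuse pattern (frozen segmentation encoder + trainable classification head) can be shown in miniature. Everything below is a stand-in: the "encoder" is just average pooling, the "crack" images are synthetic, and the head is a logistic regression trained by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(img):
    """Stand-in for the pretrained U-Net encoder: 2x2 average pooling
    applied twice, flattened into a feature vector (illustrative only)."""
    x = img
    for _ in range(2):
        x = 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2])
    return x.ravel()

def make_image(crack):
    """Synthetic 16x16 'pavement' patch; cracks are a bright vertical line."""
    img = rng.normal(0.0, 0.1, size=(16, 16))
    if crack:
        img[:, 8] += 1.0
    return img

X = np.stack([encoder(make_image(i % 2 == 0)) for i in range(40)])
y = np.array([1.0 if i % 2 == 0 else 0.0 for i in range(40)])

# Fine-tuning in miniature: the encoder stays frozen and only the
# classification head (here a logistic regression) is trained.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = (((1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5) == (y == 1.0)).mean()
```

Because the encoder was trained on the target domain (pavement images, per the paper) rather than natural images, the features it produces already separate cracked from uncracked patches, which is why a small head suffices.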
Citations: 0
Mobile Surveillance Siren Against Moving Object as a Support System for Blind People
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.170-177
D. H. Hareva, A. Sebastian, A. Mitra, Irene A. Lazarusli, C. Haryani
Visually impaired people can use smartphone navigation applications to arrive at their destination. However, those applications do not provide the means to detect moving objects. This paper presents an Android application that uses the smartphone’s camera to provide real-time object detection. Images captured by the camera are to be processed digitally. The model then predicts objects from the processed image using a Convolutional Neural Network (CNN) stored in mobile devices. The model returns bounding boxes for each of the detected objects. These bounding boxes are used to calculate the distance from the object to the camera. The model used is SSD MobileNet V1, which is pre-trained using the Common Objects in Context (COCO) dataset. System testing is divided into object distance and accuracy testing. Results show that the margin of error for calculating distance is below 5% for distances under 8 meters. The mean average precision is 0.9393, while the mean average recall is 0.4479. It means that the system can recognize moving objects through the embedded model in a smartphone.
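The abstract above derives distance from the detector's bounding boxes but does not state its formula; a common approximation is the pinhole-camera relation d = f · H / h, where H is the object's assumed real height, h its height in pixels, and f the focal length in pixels. The function name and all values below are illustrative assumptions.

```python
def distance_from_bbox(bbox, real_height_m, focal_px):
    """Estimate camera-to-object distance from a detector bounding box.

    bbox: (x_min, y_min, x_max, y_max) in pixels, as an SSD-style
    detector would return. Uses the pinhole relation d = f * H / h.
    """
    pixel_height = bbox[3] - bbox[1]
    if pixel_height <= 0:
        raise ValueError("degenerate bounding box")
    return focal_px * real_height_m / pixel_height

# Example: an assumed 1.7 m pedestrian appearing 340 px tall with an
# assumed focal length of 1000 px -> estimated distance of 5.0 m.
d = distance_from_bbox((120, 80, 260, 420), real_height_m=1.7, focal_px=1000)
```

Errors in the assumed real-world height translate linearly into distance error, which is consistent with the paper reporting accuracy only up to a bounded range (under 8 meters).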
Citations: 0
Instant Counting & Vehicle Detection during Hajj Using Drones
Q3 Computer Science Pub Date : 2023-06-01 DOI: 10.18178/joig.11.2.204-211
Abdullah M. Algamdi, Hammam M. AlGhamdi
During the past decade, artificial intelligence technologies, especially Computer Vision (CV) technologies, have experienced significant breakthroughs due to the development of deep learning models, particularly Convolutional Neural Networks (CNNs). These networks have been utilized in various research applications, including astronomy, marine sciences, security, medicine, and pathology. In this paper, we build a framework utilizing CV technology to support decision-makers during the Hajj season. We collect and process real-time/instant images from multiple aircraft/drones, which follow the pilgrims while they move around the holy sites during Hajj. These images, taken by multiple drones, are processed in two stages. First, we purify the images collected from multiple drones and stitch them, producing one image that captures the whole holy site. Second, the stitched image is processed using a CNN to provide two pieces of information: (1) the number of buses and ambulances; and (2) the estimated count of pilgrims. This information could help decision-makers identify needs for further support during Hajj, such as logistics services, security personnel, and/or ambulances.
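The second stage described above tallies vehicle types and estimates the pilgrim count from the detections over the stitched image. A minimal sketch of that aggregation step, assuming made-up class names, confidences, and a 0.5 confidence threshold (none of these come from the paper):

```python
from collections import Counter

VEHICLE_CLASSES = {"bus", "ambulance"}

def summarize_detections(detections, threshold=0.5):
    """detections: list of (class_name, confidence) pairs from a detector.

    Returns per-class vehicle counts and the estimated person count,
    keeping only detections at or above the confidence threshold.
    """
    counts = Counter(cls for cls, conf in detections if conf >= threshold)
    vehicles = {cls: counts[cls] for cls in VEHICLE_CLASSES}
    return vehicles, counts["person"]

# Illustrative detector output over one stitched frame.
detections = [
    ("bus", 0.91), ("bus", 0.88), ("ambulance", 0.76),
    ("person", 0.95), ("person", 0.60), ("person", 0.42),  # last is filtered
]
vehicles, pilgrims = summarize_detections(detections)
```

In a deployed system this summary would feed the decision-makers' dashboard; the confidence threshold trades missed detections against double-counting.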
Citations: 0