
2021 6th International Conference on Multimedia and Image Processing: Latest Papers

Research on Behavior Recognition of Dairy Goat Based on Multi-model Fusion
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449395
Yi Li, Jinglei Tang, Dongjian He
To accurately identify the behavior of dairy goats in images, a multi-model fusion convolutional neural network (CNN) method is proposed. First, the AlexNet, ResNet50 and Vgg16 models are trained separately, and the best recognition result of each model is obtained. Then, the attention weight of each model is calculated by feature stitching (concatenation) and other operations. Finally, the feature information of AlexNet, ResNet50 and Vgg16 is re-weighted with an attention mechanism, and the parameters of the fused multi-model CNN are adjusted to obtain the best recognition result of the fusion model. Experimental results show that, compared with single-model and multi-model baselines, the proposed ARV fusion model achieves higher recognition accuracy, with an average accuracy across dairy goat behaviors of 98.50%.
Citations: 3
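The re-weighting step described in the dairy goat abstract can be pictured with a small sketch. It assumes each model yields a feature vector and one raw attention score, turns the scores into weights with a softmax, scales each vector by its weight, and concatenates the results. The softmax, the function names, and the toy numbers are all assumptions made for illustration; the abstract does not specify how the attention weights are actually computed.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_features(features, scores):
    """Scale each model's feature vector by its attention weight, then concatenate.

    features: list of per-model feature vectors (lists of floats)
    scores:   one raw attention score per model (hypothetical values; a real
              system would learn them from the stitched features)
    """
    weights = softmax(scores)
    fused = []
    for w, vec in zip(weights, features):
        fused.extend(w * x for x in vec)
    return weights, fused

# Toy vectors standing in for AlexNet, ResNet50 and Vgg16 features.
alexnet_feat = [0.2, 0.8]
resnet_feat = [0.5, 0.1, 0.9]
vgg_feat = [0.7, 0.3]
weights, fused = fuse_features([alexnet_feat, resnet_feat, vgg_feat], [1.0, 2.0, 1.0])
```

Here the second model receives the largest weight because its score is highest, and the fused vector keeps every feature dimension so that a downstream classifier can still see all three models.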
Target Classification Algorithms Based on Multispectral Imaging: A Review
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449393
Zimu Zeng, Weifeng Wang, Wenpeng Zhang
Multispectral imaging extracts rich spectral information from targets, which greatly expands the capabilities of traditional imaging technology. It is widely used in agriculture, the military, medicine, industry, and meteorology. Because multispectral images contain redundant information, their dimensionality needs to be reduced by pre-processing, and in recent years most researchers have applied pre-processing before classification. Common dimensionality reduction methods, based on the principles of feature selection, feature transformation, and feature extraction, are introduced, and their advantages and disadvantages are discussed. Classification methods are then divided into traditional methods and deep learning methods, and their characteristics and application prospects are discussed. By comparison, the former are cost-effective and rest on mature theory, while the latter offer strong adaptability and high classification accuracy. At present, methods could be optimized by saving computing resources and using spectral information more efficiently. In the future, traditional methods will be improved and used in combination, while new methods with stronger adaptability and higher precision will be developed.
Citations: 1
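As a concrete instance of the feature-selection family of dimensionality reduction mentioned in the multispectral review abstract, the sketch below keeps the spectral bands whose values vary most across pixels. Ranking by variance is only one simple criterion chosen for illustration; the function name and the toy data are assumptions, and the review itself may cover quite different methods.

```python
from statistics import pvariance

def select_bands(pixels, k):
    """Keep the k spectral bands with the highest variance across pixels.

    pixels: list of pixels, each a list of per-band intensities
    Returns the sorted indices of the kept bands and the reduced pixels.
    """
    n_bands = len(pixels[0])
    variances = [pvariance([p[b] for p in pixels]) for b in range(n_bands)]
    ranked = sorted(range(n_bands), key=lambda b: variances[b], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[p[b] for b in keep] for p in pixels]
    return keep, reduced

# Three toy pixels with four bands; bands 0 and 3 are nearly constant.
pixels = [
    [0.10, 0.50, 0.90, 0.30],
    [0.11, 0.10, 0.20, 0.30],
    [0.09, 0.90, 0.50, 0.31],
]
keep, reduced = select_bands(pixels, 2)
```

Bands that barely change between pixels carry little information for separating classes, so dropping them shrinks the input to the classifier at low cost.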
Nonlinear Filtered Compressed Sensing Applied on Image De-noising
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449390
Jian Dong, Yang Ding, H. Kudo
There is still considerable need for research on noise removal in image processing. In this paper, we develop a compressed sensing (CS) based algorithm for image de-noising. Using optimization theory, we propose a cost function consisting of a data fidelity term and a penalty term, and minimize it with a proximal minimization method. The algorithm has two advantages. First, the filtering procedure is embedded into a CS framework, which makes the filtering strategy more effective: repeated post-filtering is known to blur images, whereas CS in the proposed algorithm preserves image clarity while suppressing noise. Second, the freedom to choose the filter type, especially nonlinear filters, strengthens the effectiveness and practicality of CS. While a growing body of literature reports that the total variation (TV) method fails on images with rich details, the new algorithm preserves image textures and object boundaries accurately. The convergence of the algorithm is also demonstrated on a de-noising example. Among the nonlinear filters, CS based on the nonlocal weighted median filter gave the best de-noising performance. The algorithm is considered to have potential value in other image processing problems, such as image restoration and reconstruction.
Citations: 2
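One way to picture a filter embedded in an iterative minimization, as the de-noising abstract describes, is to alternate a gradient step on the data fidelity term with a nonlinear filter that plays the role of the proximal step for the penalty term. The 1-D sketch below does this with a plain sliding median. It is a generic illustration under that assumption, not the paper's algorithm: the paper's cost function, its nonlocal weighted median filter, and its parameters are not given in the abstract, and all names here are hypothetical.

```python
from statistics import median

def median_filter(signal, radius=1):
    """Sliding-window median; windows are truncated at the signal ends."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(median(signal[lo:hi]))
    return out

def denoise(noisy, steps=5, step_size=0.5):
    """Alternate a data-fidelity gradient step with a median filter."""
    x = list(noisy)
    for _ in range(steps):
        # Gradient step on 0.5 * ||x - noisy||^2 pulls x back toward the data.
        x = [xi - step_size * (xi - yi) for xi, yi in zip(x, noisy)]
        # The nonlinear filter stands in for the proximal step of the penalty.
        x = median_filter(x)
    return x

# A step edge corrupted by two impulses, one on each side of the edge.
clean = [0.0] * 5 + [1.0] * 5
noisy = list(clean)
noisy[2] = 1.0
noisy[7] = 0.0
restored = denoise(noisy)
```

In this toy case the impulses are removed while the step between index 4 and index 5 stays sharp, which is the kind of boundary preservation the abstract attributes to nonlinear filters.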
The Effect of A Visual Novel Application on Students’ Learning Motivation in Biology for Secondary School in Malaysia
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449399
K. T. Chau, N. Nasir
This research investigates whether a visual novel, a genre of digital game, titled “Grey Plague” can enhance students’ motivation in Biology. The research was motivated by the observation that many systems are largely unsuccessful in teaching Biology concepts to students. A quantitative study using a set of questionnaires completed by 30 students was conducted. The user evaluation shows that the mean scores for Learning Motivation and Aesthetics were 4.762 and 4.111 respectively. Aesthetics was also found to be positively correlated with Learning Motivation, with a correlation value of 0.083. The study concludes that the visual novel application motivates Malaysian secondary school science students and stimulates their interest in pursuing Biology.
Citations: 2
Correlation Filters with Pre-position by Reconstruction Error for Visual Tracking
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449392
Sheng-liang Hu, Mingwu Ren
Correlation filters built on deep neural networks are a mainstream approach to real-time object tracking. They combine the high efficiency of correlation filtering with the strong representation ability of convolutional neural networks. However, this approach inherits most shortcomings of correlation filters, such as boundary effects. If an object is close to the boundary of the search area because of a large displacement, useful information is filtered out by the cosine window and padding. To alleviate boundary effects, we propose a coarse positioning module that fine-tunes the search area before the cosine window and padding are applied. The core of the proposed module is saliency detection based on reconstruction error. This enables the improved trackers to retain more object information than the original trackers. Experimental results show that our method clearly improves the baseline model, DCFNet, in the case of fast motion. Because the coarse positioning module has a low computational cost, the improved trackers still run at real-time speed.
Citations: 0
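The boundary effect mentioned in the tracking abstract can be seen in a 1-D toy: a cosine (Hann) window is close to zero near the edges of the search area, so an object that has moved toward the boundary keeps very little of its energy after windowing. The sketch below compares a centered object with a displaced one. The window length, the object width, and the function names are assumptions for illustration and are not taken from DCFNet or the paper.

```python
import math

def hann(n):
    """Hann (cosine) window of length n, zero at both ends."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def windowed_energy(patch, window):
    """Energy of a 1-D patch after multiplying by the window."""
    return sum((p * w) ** 2 for p, w in zip(patch, window))

n = 33
window = hann(n)

def target_at(center, width=5):
    """Toy 1-D object: a box of ones centered at the given index."""
    half = width // 2
    return [1.0 if abs(i - center) <= half else 0.0 for i in range(n)]

centered = windowed_energy(target_at(16), window)   # object in the middle of the search area
near_edge = windowed_energy(target_at(3), window)   # object displaced toward the boundary
```

The displaced object retains only a small fraction of the energy of the centered one, which is why re-centering the search area on the object before windowing, as the coarse positioning module aims to do, can help under fast motion.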
Registration between MVCT reconstructed from EPID and kVCT
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449398
Miaomiao Lu, Jun Zhang, Zhibiao Cheng, Junhai Wen
The registration of two-dimensional MV electronic portal imaging device (EPID) images with digitally reconstructed radiograph (DRR) images has been widely used for setup error correction in radiotherapy, but these approaches do not estimate the 3D transformation very accurately. The purpose of this paper is to verify the feasibility of a new setup error estimation method that registers the 3D planning CT image with a 3D image reconstructed from EPID data. EPID images were acquired and used to reconstruct the MVCT with the Algebraic Reconstruction Technique (ART) algorithm. The reconstructed image and the planning CT were registered by maximizing the mutual information (MI) between the two 3D images. The registration error is less than 3 mm, which is suitable for clinical implementation. The study demonstrates that the proposed 3D/3D registration method is feasible for setup error correction in radiotherapy.
Citations: 1
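Registration by maximizing mutual information, as used in the MVCT/kVCT abstract, can be sketched in 1-D. Mutual information is estimated from the joint histogram of the two signals, and the alignment is the candidate transform with the highest value. The example below searches integer circular shifts only; the paper registers 3D volumes, and its transform model and optimizer are not described in the abstract, so the function names and the exhaustive search are assumptions.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two equally long lists of discrete intensities."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa = Counter(a)
    pb = Counter(b)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (pa[x] * pb[y]).
        mi += p_xy * math.log(count * n / (pa[x] * pb[y]))
    return mi

def best_shift(fixed, moving, max_shift):
    """Try every integer circular shift of `moving` and return the one with maximal MI."""
    def shifted(s):
        return [moving[(i - s) % len(moving)] for i in range(len(moving))]
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: mutual_information(fixed, shifted(s)))

fixed = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
# The same profile displaced by two samples with remapped intensities,
# loosely mimicking the different contrast of MV and kV images.
moving = [7, 9, 4, 9, 7, 5, 5, 5, 5, 5]
shift = best_shift(fixed, moving, 3)
```

Because mutual information depends only on how consistently intensities co-occur, the search recovers the displacement even though the two profiles use different grey levels, which is the property that makes the measure attractive for registering images from different modalities.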
Lipid droplet recognition based on watershed algorithm and convolutional neural network
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449400
Shiwei Li, Shiqun Yin, Haibo Deng
Unbalanced storage and utilization of lipids in the liver can easily lead to non-alcoholic fatty liver, obesity and metabolic syndrome. It is therefore important to detect and classify lipids in cell pathology images. To identify lipid droplets accurately, we improve the watershed algorithm to segment the droplets and classify them with a convolutional neural network trained by transfer learning. Experiments show that the improved watershed algorithm segments lipid droplets well, and that the transfer-learned convolutional neural network achieves a classification accuracy of about 99%.
Citations: 0
Analysis of Commodity image recognition based on deep learning
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449389
Lijuan Xie
Deep learning has developed rapidly in recent years, especially in the field of image recognition. This paper investigates commodity recognition based on object detection with deep convolutional neural networks. First, a commodity image dataset is constructed from real-world retail checkout scenarios. Then, object detection networks are trained on the image data. Finally, three representative deep learning methods, namely YOLOv3, Faster R-CNN and RetinaNet, are analyzed in detail. The experimental results show the effectiveness of the proposed approach.
Citations: 1
An Improved SSD for small target detection
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449391
Xiang Li, Haibo Luo
SSD is a heuristic one-stage target detection approach. Although it achieves impressive results in general target detection, it still struggles with small objects and precise localization. In this paper, we propose an improved SSD that focuses on small target detection. We add a shallow, high-resolution feature to the hierarchy of detection features used for prediction. We then fuse the detection features (including the shallow, high-resolution one) into a feature pyramid through several convolution layers and upsampling operations, passing information from the deep features to the shallow ones in order to enrich the semantic information of the shallow features. To make the network easier to converge, we apply L2 normalization to the bottom detection feature of the pyramid so that the norms of the pyramid features are balanced. Experimental results on the VEDAI dataset show that the proposed method improves markedly over the original SSD in small target detection.
Citations: 4
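The L2 normalization mentioned in the improved SSD abstract can be sketched as follows: at every spatial location, the vector of channel activations is divided by its L2 norm and multiplied by a scale, so that a shallow feature with large activations ends up on the same footing as the others. The fixed scale of 20, the nested-list layout, and the function name are assumptions for illustration; detectors commonly learn the scale per channel, and the abstract does not say which variant the paper uses.

```python
import math

def l2_normalize(feature_map, scale=20.0, eps=1e-10):
    """Rescale the channel vector at every spatial location to L2 norm `scale`.

    feature_map: nested list indexed as [channel][row][col]
    scale: a fixed illustrative constant (assumed, not taken from the paper)
    """
    channels = len(feature_map)
    rows = len(feature_map[0])
    cols = len(feature_map[0][0])
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(channels)]
    for r in range(rows):
        for c in range(cols):
            norm = math.sqrt(sum(feature_map[k][r][c] ** 2 for k in range(channels))) + eps
            for k in range(channels):
                out[k][r][c] = scale * feature_map[k][r][c] / norm
    return out

# Two channels, one row, two columns; the second location is ten times larger.
shallow = [[[3.0, 30.0]], [[4.0, 40.0]]]
normalized = l2_normalize(shallow)
```

Both locations end up with the same normalized vector even though their raw magnitudes differ by a factor of ten, which is the balancing effect the abstract relies on for easier convergence.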
Document Fragments Restoration via Similarity Measurement
Pub Date : 2021-01-08 DOI: 10.1145/3449388.3449401
Yuelan Liu, Yuefan Liu, Fanyu Meng
Automatic restoration of shredded paper is an important branch of computer science. It plays an important role in the restoration of judicial evidence, the restoration of secret documents, and many other areas. In this article, we establish a similarity measurement model using data mining, focusing mainly on Chinese text documents that have been cut regularly. A mathematical model is established and used for restoration, and we provide several similarity measures that achieve the restoration and reduce the workload of manual intervention. The article also provides a way to restore shredded double-sided printed documents. Experimental results demonstrate the effectiveness of the proposed method.
Citations: 0
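A similarity measurement for regularly cut strips, as the document restoration abstract describes, can be illustrated by comparing the right edge of one strip with the left edge of another and chaining the best matches. The sketch below uses the sum of absolute grey-level differences and a greedy ordering that starts from a strip assumed to be the leftmost. The distance, the greedy strategy, and all names are assumptions for illustration; the paper's actual measures for Chinese text are not given in the abstract.

```python
def edge_distance(left_strip, right_strip):
    """Sum of absolute differences between one strip's right edge and another's left edge."""
    return sum(abs(row_l[-1] - row_r[0]) for row_l, row_r in zip(left_strip, right_strip))

def greedy_order(strips, first=0):
    """Starting from a known leftmost strip, repeatedly append the best-matching remaining strip."""
    order = [first]
    remaining = set(range(len(strips))) - {first}
    while remaining:
        current = strips[order[-1]]
        nxt = min(sorted(remaining), key=lambda j: edge_distance(current, strips[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# A 3x6 toy page of grey levels, cut into three vertical strips of width 2.
page = [
    [10, 20, 30, 40, 50, 60],
    [15, 25, 35, 45, 55, 65],
    [12, 22, 32, 42, 52, 62],
]
cut = [[row[c:c + 2] for row in page] for c in (0, 2, 4)]
shuffled = [cut[0], cut[2], cut[1]]   # strips as found, out of order
order = greedy_order(shuffled)
```

The returned order lists the indices of the shuffled strips from left to right. A greedy chain like this can go wrong when several edges look alike, which is one reason a restoration pipeline may still need some manual intervention.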