To accurately identify the behavior of dairy goats in images, we propose a multi-model fusion convolutional neural network (CNN) method. First, the AlexNet, ResNet50, and VGG16 models are trained separately, and the best recognition result of each model is obtained. Then, the attention weight of each model is calculated through feature concatenation and other operations. Finally, the feature information of AlexNet, ResNet50, and VGG16 is re-weighted with the attention mechanism, and the parameters of the fused multi-model CNN are tuned to obtain the best recognition result of the fusion model. Experimental results show that, compared with the single models and with other multi-model combinations, the proposed ARV fusion model achieves higher recognition accuracy, with an average accuracy across dairy goat behaviors of 98.50%.
{"title":"Research on Behavior Recognition of Dairy Goat Based on Multi-model Fusion","authors":"Yi Li, Jinglei Tang, Dongjian He","doi":"10.1145/3449388.3449395","DOIUrl":"https://doi.org/10.1145/3449388.3449395","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
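The attention re-weighting step in the dairy goat abstract can be illustrated with a minimal sketch. It assumes each backbone has already produced a feature vector and that a raw attention score per model is available; the toy feature values, the scores, and the function names `softmax` and `fuse_features` are hypothetical and not taken from the paper, where the scores would be learned from the concatenated features.

```python
import math

def softmax(scores):
    """Turn raw per-model scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_features(features, scores):
    """Scale each model's feature vector by its attention weight, then concatenate."""
    weights = softmax(scores)
    fused = []
    for w, feat in zip(weights, features):
        fused.extend(w * x for x in feat)
    return weights, fused

# Hypothetical toy features standing in for the three backbones' outputs.
alexnet_feat = [0.2, 0.8]
resnet50_feat = [0.5, 0.1, 0.9]
vgg16_feat = [0.4, 0.6]

# Hypothetical attention scores; a real system would learn these.
scores = [0.1, 1.2, 0.4]

weights, fused = fuse_features([alexnet_feat, resnet50_feat, vgg16_feat], scores)
print(weights)
print(fused)
```

The fused vector would then feed a classifier head; the model with the largest score contributes the most to it.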
Multispectral imaging extracts rich spectral information from targets, greatly expanding the capabilities of traditional imaging technology. It is widely used in agriculture, the military, medicine, industry, and meteorology. Because multispectral images contain redundant information, their dimensionality must be reduced by pre-processing, and in recent years most researchers have applied such pre-processing before classification. This review introduces common dimensionality reduction methods based on the principles of feature selection, feature transformation, and feature extraction, and discusses their advantages and disadvantages. Classification methods are then divided into traditional methods and deep learning methods, and their characteristics and application prospects are discussed. By comparison, the former are cost-effective and rest on mature theory, while the latter offer strong adaptability and high classification accuracy. At present, methods could be optimized to save computing resources and to use spectral information more efficiently. In the future, traditional methods will be improved and used in combination, while new methods with greater adaptability and precision will be developed.
{"title":"Target Classification Algorithms Based on Multispectral Imaging: A Review","authors":"Zimu Zeng, Weifeng Wang, Wenpeng Zhang","doi":"10.1145/3449388.3449393","DOIUrl":"https://doi.org/10.1145/3449388.3449393","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
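As one illustration of the feature selection family mentioned in the multispectral review abstract, the sketch below keeps the spectral bands with the highest variance across pixels. Variance ranking is only one simple criterion chosen here for illustration; the review does not single it out, and the function name `select_bands` and the toy pixel spectra are hypothetical.

```python
from statistics import pvariance

def select_bands(pixels, k):
    """Keep the k bands with the highest variance across pixels.

    pixels: list of per-pixel spectra, all of the same length.
    Returns the kept band indices (ascending) and the reduced spectra.
    """
    n_bands = len(pixels[0])
    variances = [pvariance([p[b] for p in pixels]) for b in range(n_bands)]
    ranked = sorted(range(n_bands), key=lambda b: variances[b], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[p[b] for b in keep] for p in pixels]
    return keep, reduced

# Hypothetical 4-band spectra for four pixels; bands 0 and 3 are constant.
pixels = [
    [0.1, 0.5, 0.9, 0.3],
    [0.1, 0.6, 0.1, 0.3],
    [0.1, 0.4, 0.5, 0.3],
    [0.1, 0.5, 0.2, 0.3],
]

keep, reduced = select_bands(pixels, k=2)
print(keep)
print(reduced)
```

Constant bands carry no discriminative information for a classifier, which is why they are dropped first under this criterion.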
Noise removal remains an important topic in image processing. In this paper, we develop a compressed sensing (CS) based algorithm for image de-noising. Using optimization theory, we propose a cost function consisting of a data fidelity term and a penalty term, and minimize it with a proximal minimization method. The advantage of the algorithm is two-fold. First, embedding the filtering procedure into a CS framework enhances the effectiveness of the filtering strategy: repeated post-filtering is known to blur images, whereas CS in the proposed algorithm preserves image clarity while suppressing noise. Second, the freedom to choose the filter type, especially nonlinear filters, strengthens the effectiveness and practicability of CS. While a growing body of literature reports that the total variation (TV) method fails on images with rich details, the new algorithm preserves image textures and object boundaries accurately. The convergence of the algorithm was also demonstrated on the de-noising example. Among the nonlinear filters tested, CS based on the nonlocal weighted median filter gave the best de-noising performance. The algorithm is potentially applicable to other image processing problems, such as image restoration and reconstruction.
{"title":"Nonlinear Filtered Compressed Sensing Applied on Image De-noising","authors":"Jian Dong, Yang Ding, H. Kudo","doi":"10.1145/3449388.3449390","DOIUrl":"https://doi.org/10.1145/3449388.3449390","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
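The idea of combining a data fidelity term with a filter-based penalty, as in the de-noising abstract, can be sketched on a 1D signal. This is a simplified stand-in and not the authors' algorithm: it assumes a plain median filter in place of the nonlocal weighted median filter, and at each step it minimizes 0.5·||x − y||² + (β/2)·||x − F(x_k)||², which has the closed-form solution used below. The names `median_filter`, `denoise`, and the parameter `beta` are hypothetical.

```python
from statistics import median

def median_filter(signal, radius=1):
    """Sliding-window median; the window is truncated at the signal ends."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(median(signal[lo:hi]))
    return out

def denoise(noisy, beta=2.0, iterations=20):
    """Alternate filtering with a closed-form data-fidelity update.

    Each iteration solves
        x_next = argmin_x 0.5*||x - noisy||^2 + (beta/2)*||x - F(x_k)||^2
    whose solution is (noisy + beta * F(x_k)) / (1 + beta).
    """
    x = list(noisy)
    for _ in range(iterations):
        filtered = median_filter(x)
        x = [(y + beta * f) / (1.0 + beta) for y, f in zip(noisy, filtered)]
    return x

# A flat signal with one impulsive outlier.
noisy = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0]
restored = denoise(noisy)
print(restored)
```

Because every update stays anchored to the observed data, the result does not drift toward the over-smoothed output that repeated filtering alone would give; `beta` controls the balance between the two terms.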
This work investigates whether “Grey Plague”, a digital game in the visual novel genre, can enhance students’ motivation in Biology. The research was motivated by the observation that many existing systems are largely unsuccessful in teaching Biology concepts to students. A quantitative study using a questionnaire completed by 30 students was conducted. The user evaluation shows mean scores of 4.762 for Learning Motivation and 4.111 for Aesthetics. Aesthetics was also found to be positively correlated with Learning Motivation, with a correlation value of 0.083. The study concludes that the visual novel application motivates Malaysian secondary school science students and stimulates their interest in pursuing Biology.
{"title":"The Effect of A Visual Novel Application on Students’ Learning Motivation in Biology for Secondary School in Malaysia","authors":"K. T. Chau, N. Nasir","doi":"10.1145/3449388.3449399","DOIUrl":"https://doi.org/10.1145/3449388.3449399","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
Correlation filters built on deep neural networks are a mainstream approach to real-time object tracking, combining the high efficiency of correlation filtering with the strong representation ability of convolutional neural networks. However, this approach inherits most shortcomings of correlation filters, such as boundary effects. If a large displacement places an object close to the boundary of the search area, useful information is filtered out by the cosine window and padding. To alleviate boundary effects, we propose a coarse positioning module that fine-tunes the search area before the cosine window and padding are applied. The core of the module is saliency detection based on reconstruction error, which enables the improved trackers to retain more object information than the originals. Experimental results show that our method clearly improves on the baseline model, DCFNet, in cases of fast motion. Because the coarse positioning module has a low computational cost, the improved trackers still run in real time.
{"title":"Correlation Filters with Pre-position by Reconstruction Error for Visual Tracking","authors":"Sheng-liang Hu, Mingwu Ren","doi":"10.1145/3449388.3449392","DOIUrl":"https://doi.org/10.1145/3449388.3449392","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
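The boundary effect described in the tracking abstract can be seen in a small 1D sketch of a cosine (Hann) window. The sketch only shows why a response near the edge of the search area is suppressed; it does not implement the coarse positioning module, and the search-area length and the two toy signals are hypothetical.

```python
import math

def hann_window(n):
    """Cosine (Hann) window of length n: 0 at both ends, 1 at the center."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1))) for i in range(n)]

def apply_window(signal):
    """Multiply a signal element-wise by the Hann window of the same length."""
    window = hann_window(len(signal))
    return [s * w for s, w in zip(signal, window)]

n = 9  # hypothetical search-area length

# Object response at the center of the search area.
centered = [0.0] * n
centered[4] = 1.0

# Same response after a large displacement, now close to the boundary.
near_edge = [0.0] * n
near_edge[1] = 1.0

print(max(apply_window(centered)))
print(max(apply_window(near_edge)))
```

Re-centering the search area on the object before windowing, which is the role the abstract assigns to the coarse positioning module, keeps the response in the high-weight region of the window.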
Registration of two-dimensional megavoltage (MV) electronic portal imaging device (EPID) images with digitally reconstructed radiograph (DRR) images has been widely used for setup error correction in radiotherapy, but the 3D transformation estimated by these approaches is not very accurate. The purpose of this paper is to verify the feasibility of a new setup error estimation method that registers the 3D planning CT image with a 3D image reconstructed from EPID images. EPID images were acquired and used to reconstruct the MVCT with the Algebraic Reconstruction Technique (ART). The reconstructed image and the planning CT were registered by maximizing the mutual information (MI) between the two 3D images. The registration error is less than 3 mm, which is suitable for clinical implementation. The study demonstrates that the proposed 3D/3D registration method is feasible for setup error correction in radiotherapy.
{"title":"Registration between MVCT reconstructed from EPID and kVCT","authors":"Miaomiao Lu, Jun Zhang, Zhibiao Cheng, Junhai Wen","doi":"10.1145/3449388.3449398","DOIUrl":"https://doi.org/10.1145/3449388.3449398","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
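Registration by maximizing mutual information, as used in the MVCT/kVCT abstract, can be sketched in one dimension. The example assumes intensities in the range 0 to 255, a four-bin joint histogram, and a search over circular shifts only; a real 3D/3D registration would search over rigid transformations of volumes. The function names and the toy intensity profile are hypothetical.

```python
import math
from collections import Counter

def mutual_information(a, b, bins=4, max_value=256):
    """Mutual information (in nats) from the joint histogram of two signals."""
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    pa, pb = Counter(qa), Counter(qb)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log(p_ij / ((pa[i] / n) * (pb[j] / n)))
    return mi

def rotate_left(signal, s):
    """Circularly shift a signal left by s samples."""
    return signal[s:] + signal[:s]

def register(fixed, moving):
    """Return the circular shift of `moving` that maximizes MI with `fixed`."""
    return max(range(len(fixed)),
               key=lambda s: mutual_information(fixed, rotate_left(moving, s)))

# Hypothetical intensity profile and a copy displaced by three samples.
fixed = [10, 30, 90, 220, 150, 140, 20, 80, 100, 230, 200, 160]
moving = fixed[-3:] + fixed[:-3]

best_shift = register(fixed, moving)
print(best_shift)
```

Mutual information depends only on the statistical relationship between intensities, not on their absolute values, which is what makes it suitable for images from different modalities such as MVCT and planning CT.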
Unbalanced storage and utilization of lipids in the liver can easily lead to non-alcoholic fatty liver disease, obesity, and metabolic syndrome. It is therefore important to detect and classify lipids in cell pathology images. To identify lipid droplets accurately, we improved the watershed algorithm to segment the droplets and classified them with a convolutional neural network using transfer learning. Experiments show that the improved watershed algorithm segments lipid droplets well, and that the transfer-learned convolutional neural network achieves a classification accuracy of about 99%.
{"title":"Lipid droplet recognition based on watershed algorithm and convolutional neural network","authors":"Shiwei Li, Shiqun Yin, Haibo Deng","doi":"10.1145/3449388.3449400","DOIUrl":"https://doi.org/10.1145/3449388.3449400","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
Deep learning has developed rapidly in recent years, especially in the field of image recognition. This paper investigates commodity recognition based on object detection with deep convolutional neural networks. First, a commodity image dataset is constructed from real-world retail checkout scenarios. Then, object detection networks are trained on the image data. Finally, three representative deep learning methods, YOLOv3, Faster R-CNN, and RetinaNet, are analyzed in detail. The experimental results show the effectiveness of the proposed approach.
{"title":"Analysis of Commodity image recognition based on deep learning","authors":"Lijuan Xie","doi":"10.1145/3449388.3449389","DOIUrl":"https://doi.org/10.1145/3449388.3449389","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
SSD is a heuristic one-stage target detection approach. Although it achieves impressive results in general target detection, it still struggles with small objects and precise localization. In this paper, we propose an improved SSD that focuses on small-target detection. We add a shallow, high-resolution feature to the hierarchy of detection features used for prediction. We then fuse the detection features (including the shallow, high-resolution one) into a feature pyramid through convolution layers and upsampling operations, passing information from deep features to shallow ones to enrich the semantic information of the shallow features. To make the network easier to converge, we apply L2 normalization to the bottom detection feature of the pyramid to balance the norms of the pyramid features. Experimental results on the VEDAI dataset show that the proposed method improves substantially on the original SSD for small-target detection.
{"title":"An Improved SSD for small target detection","authors":"Xiang Li, Haibo Luo","doi":"10.1145/3449388.3449391","DOIUrl":"https://doi.org/10.1145/3449388.3449391","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
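The top-down fusion and L2 normalization steps in the improved SSD abstract can be sketched with plain nested lists. The sketch makes several simplifying assumptions that are not from the paper: feature maps are laid out as [channel][row][column], the learned convolution layers are replaced by element-wise addition, upsampling is nearest-neighbour by a factor of two, and the `scale` parameter is a fixed constant rather than a learned one.

```python
import math

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[x + y for x, y in zip(ra, rb)]
             for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def l2_normalize(fmap, scale=1.0, eps=1e-12):
    """Normalize the channel vector at every spatial location to length `scale`."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            norm = math.sqrt(sum(fmap[k][i][j] ** 2 for k in range(c))) + eps
            for k in range(c):
                out[k][i][j] = scale * fmap[k][i][j] / norm
    return out

# Hypothetical two-channel features: a deep 1x1 map and a shallow 2x2 map.
deep = [[[1.0]], [[2.0]]]
shallow = [[[3.0, 0.0], [0.0, 3.0]],
           [[2.0, 0.0], [0.0, 2.0]]]

fused = add_maps(shallow, upsample2x(deep))   # pass deep information downward
normalized = l2_normalize(fused)              # balance the feature norm
print(fused)
print(normalized)
```

After normalization every spatial location has the same channel-vector norm, so the shallow feature no longer dominates or vanishes relative to the deeper pyramid levels.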
Automatic restoration of shredded paper is an important branch of computer science. It plays an important role in judicial evidence recovery, the restoration of secret documents, and many other areas. In this article, we establish a similarity measurement model using data mining, focusing mainly on regularly cut Chinese text documents. The mathematical model is used for restoration, and we provide several similarity measures that accomplish the restoration while reducing the need for manual intervention. We also provide a way to restore shredded documents printed on both sides. Experimental results demonstrate the effectiveness of the proposed method.
{"title":"Document Fragments Restoration via Similarity Measurement","authors":"Yuelan Liu, Yuefan Liu, Fanyu Meng","doi":"10.1145/3449388.3449401","DOIUrl":"https://doi.org/10.1145/3449388.3449401","journal":{"name":"2021 6th International Conference on Multimedia and Image Processing"},"publicationDate":"2021-01-08"}
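Similarity-based reassembly of regularly cut strips, the topic of the document restoration abstract, can be sketched with a greedy edge-matching procedure. The abstract does not specify its similarity measures, so the pixel-agreement score below is an assumption, as are the binary toy strips, the rule that the leftmost strip has a blank left edge, and the function names.

```python
def edge_similarity(left_strip, right_strip):
    """Count rows where the right edge of one strip equals the left edge of another."""
    right_edge = [row[-1] for row in left_strip]
    left_edge = [row[0] for row in right_strip]
    return sum(1 for a, b in zip(right_edge, left_edge) if a == b)

def find_first(strips):
    """Pick the strip whose left edge is entirely blank (assumed page margin)."""
    for index, strip in enumerate(strips):
        if all(row[0] == 0 for row in strip):
            return index
    raise ValueError("no strip with a blank left edge")

def reassemble(strips):
    """Greedily append the unused strip that best matches the current right edge."""
    order = [find_first(strips)]
    remaining = set(range(len(strips))) - set(order)
    while remaining:
        last = strips[order[-1]]
        best = max(sorted(remaining), key=lambda k: edge_similarity(last, strips[k]))
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical binary strips (1 = ink) given in shuffled order.
strips = [
    [[1, 0], [0, 1], [1, 1], [0, 1]],  # originally the middle strip
    [[0, 0], [1, 0], [1, 0], [1, 0]],  # originally the right strip
    [[0, 1], [0, 0], [0, 1], [0, 0]],  # originally the left strip
]

order = reassemble(strips)
print(order)
```

A greedy pass like this can go wrong when several edges score equally, which is the kind of case where the manual intervention mentioned in the abstract would still be needed.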