
Proceedings of the 2nd International Conference on Image and Graphics Processing — Latest Publications

Interactive reconstruction of the 3D-models using single-view images and user markup
A. Fedorov, I. Ivashnev, V. Afanasiev, Valery Krivtsov, Aleksandr Zatolokin, S. Zyrin
This paper presents a method for reconstructing a 3D model from a single-view image of an object. To do this, we need to detect the edges of the object in the image and obtain a small amount of markup from the user. In addition, our method allows us to obtain the texture of the 3D model.
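The edge-detection step can be sketched with a plain Sobel gradient magnitude; this is a hypothetical stand-in, as the paper does not specify which edge detector it uses:

```python
import math

# Hypothetical stand-in for the paper's edge detector: Sobel gradient
# magnitude over a grayscale image given as a 2D list of intensities.
def sobel_edges(img):
    h, w = len(img), len(img[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal Sobel kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical Sobel kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# A vertical step edge between columns 1 and 2 yields a strong response.
edges = sobel_edges([[0, 0, 255, 255]] * 4)
```

A real pipeline would follow this with thresholding and contour linking before asking the user for markup.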
DOI: 10.1145/3313950.3313953 · Published: 2019-02-23 (Journal Article)
Citations: 1
Speech recognition and Filipino sign language E-tutor system: an assistive multimodal learning approach
M. Samonte
Speech recognition technology facilitates student learning. It has potential benefits for students with physical disabilities, and the technology has been implemented in classrooms over the years to support more efficient learning. This study provides deaf students with various methods of studying, learning, and remembering new information. Aside from speech-to-text, the developed system also provides a speech-to-visual approach, which represents information associated with objects. In addition, Filipino Sign Language was utilized as an alternative way of presenting the Statistics lessons included in the K-12 curriculum. A practical, real-world approach to presenting Statistics lessons is used to enhance delivery in a face-to-face class setup or in self-paced learning. These multiple learning strategies were combined into a balanced approach so that practice and recall are more successful, especially for the target users. Initial results showed a significant advantage of using speech recognition and Filipino Sign Language in learning basic Statistics lessons compared with the traditional method.
DOI: 10.1145/3313950.3313970 · Published: 2019-02-23 (Journal Article)
Citations: 0
Korean sign language recognition based on image and convolution neural network
Hyojoo Shin, Woo-Je Kim, Kyoung-ae Jang
The purpose of this paper is to develop a convolutional neural network based model for Korean sign language recognition. For this purpose, sign language videos were collected for 10 selected words of Korean sign language, and each video was converted into a sequence of 9 image frames. These 9-frame image sequences were used as input data for the convolutional neural network model developed in this study. To develop the model, experiments to determine the number of convolution layers were first performed. Second, experiments on pooling, which intentionally reduces the features of the feature map, were performed. Third, we conducted an experiment to reduce overfitting in the model learning process. Based on these experiments, we developed a convolutional neural network based model for Korean sign language recognition. The accuracy of the developed model was about 84.5% for the 10 selected Korean sign words.
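One plausible way to convert a video into a fixed 9-frame input is to sample evenly spaced frame indices; the paper does not state its sampling scheme, so this is an assumption:

```python
def sample_frame_indices(n_frames, k=9):
    """Pick k evenly spaced frame indices from a video with n_frames frames,
    centering each sample in an equal-length segment of the clip."""
    if n_frames < k:
        raise ValueError("video has fewer frames than requested samples")
    return [int((i + 0.5) * n_frames / k) for i in range(k)]

# e.g. a 3-second clip recorded at 30 fps
indices = sample_frame_indices(90)
```

The selected frames would then be stacked (e.g. along the channel axis) to form the network's input tensor.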
DOI: 10.1145/3313950.3313967 · Published: 2019-02-23 (Journal Article)
Citations: 11
Development of the virtual museum of nonthaburi
Kemmanat Mingsiritham, Gan Chanyawudhiwan
Nonthaburi is a city for learning about history, local wisdom, art and culture, and lifestyle. It is necessary to establish a learning resource that creates a feeling of bonding with and protection of the city. However, the problem is that visitors need to travel to the learning resource. The use of technology enables easy and convenient access to the learning resource, supporting preservation of the culture. This research aimed to develop the Virtual Museum of Nonthaburi by interviewing curators and/or staff of the Nonthaburi Museum and experts in virtual learning resources, and studied the usage results of the Virtual Museum of Nonthaburi among general visitors. Data were analyzed using the mean, standard deviation, and content analysis. Research results: 1) The model of the Virtual Museum of Nonthaburi consists of 6 components: 1) information, 2) media and tools used, 3) interaction, 4) design, 5) decision support system, and 6) supporting factors. The overall quality was rated at the highest level (X̄ = 4.51, S.D. = 0.57). 2) Usage results of the Virtual Museum of Nonthaburi found that most visitors viewed Nonthaburi as having many tourist attractions with historical significance, cultural value, and living aspects. In particular, pottery, which presents the development from the past to the present, is beautiful local artwork that should be preserved. The overall satisfaction was rated at a high level (X̄ = 4.42, S.D. = 0.65).
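The reported mean and standard deviation can be reproduced from survey responses with the standard library; the ratings below are hypothetical, since the study's raw data is not published:

```python
import statistics

# Hypothetical 5-point Likert ratings (the study's raw data is not given).
ratings = [5, 4, 5, 4, 4, 5, 3, 5, 4, 5]

mean = statistics.mean(ratings)   # the X̄ reported in such studies
sd = statistics.stdev(ratings)    # sample standard deviation (S.D.)
```

The study presumably aggregates many such per-question rating vectors in exactly this way before interpreting the level (e.g. "highest" for X̄ above 4.51).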
DOI: 10.1145/3313950.3313972 · Published: 2019-02-23 (Journal Article)
Citations: 1
Polarization image fusion algorithm based on global information correction
Xia Wang, Jing Sun, Ziyan Xu, Jun Chang
This paper proposes a fusion framework for extracting more information from multi-dimensional polarization images. Overall, the challenge lies in overcoming the information loss arising from reflection/irradiation interference of polarizers, inherent defects of intensity images, and improper distribution of fusion weights in most fusion processes. We therefore introduce a modified front-polarizer system model, the Tiansi mask operator, and comprehensive weights. We start our methodology with the modified front-polarizer system model, aiming to correct the polarization information. Then, we exploit the high-frequency information enhancement and low-frequency information preservation abilities of the Tiansi operator, combined with adaptive histogram equalization (AHE), to achieve intensity enhancement. Finally, the contrast, saliency, and exposedness weights of the source images are calculated using Laplace filtering, the IG algorithm, and a Gaussian model respectively, and combined to obtain the comprehensive weights. The final image is obtained by fusing the processed images with the corresponding weight coefficients. Experimental results show that our method has good visual effects and is beneficial to target detection.
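The final weighted-fusion step can be sketched as a pixel-wise blend under normalized weights; the weight maps here are toy values, not the Laplace/IG/Gaussian weights the paper computes:

```python
def fuse(images, weight_maps):
    """Pixel-wise weighted fusion: normalize the weights at each pixel so
    they sum to 1, then blend the source images accordingly."""
    h, w = len(images[0]), len(images[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = sum(wm[y][x] for wm in weight_maps)
            fused[y][x] = sum(img[y][x] * wm[y][x] / total
                              for img, wm in zip(images, weight_maps))
    return fused

imgs = [[[100, 200]], [[50, 100]]]    # two toy 1x2 source "images"
wmaps = [[[3, 1]], [[1, 1]]]          # toy comprehensive weight maps
fused = fuse(imgs, wmaps)
```

Normalizing per pixel guarantees the fused intensity stays inside the range spanned by the sources, regardless of how the raw weights are scaled.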
DOI: 10.1145/3313950.3313955 · Published: 2019-02-23 (Journal Article)
Citations: 3
Extraction of features from video files using different image algebraic point operations
P. Dutta, M. Nachamai
In the field of human-computer interaction (HCI), facial feature analysis and extraction are the most decisive stages leading to robust and efficient classification systems such as facial expression recognition and emotion classification. In this paper, an approach to automatic facial feature extraction from different videos is presented using several image algebraic operations. These operations act on pixel intensity values individually, through mathematical theory involved in image analysis and transformations. Eleven operations (point subtraction, point addition, point multiplication, point division, edge detection, average neighborhood filtering, image stretching, log operation, exponential operation, inverse filtering, and image thresholding) are implemented and tested on images (video frames) extracted from three self-recorded videos named video1, video2, and video3, in .avi, .mp4, and .wmv format respectively. The work is tested on two types of data: grayscale and RGB (Red, Green, Blue). To assess the efficiency of each operation, three factors are considered: processing time, frames per second (FPS), and sharpness of the edges of feature points based on image gradients. The implementation was done in MATLAB R2017a.
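A few of the eleven point operations are sketched below in Python rather than the paper's MATLAB; each acts on a single pixel intensity, which is what makes them "point" operations:

```python
import math

def log_op(pixel, c=1.0):
    """Log transform: compresses high intensities, expands dark detail."""
    return c * math.log(1 + pixel)

def threshold(pixel, t=128):
    """Binary thresholding against a cutoff t."""
    return 255 if pixel >= t else 0

def point_subtract(p, q):
    """Point subtraction, clamped to the valid 8-bit range."""
    return max(0, p - q)
```

Applied frame by frame, such operations are cheap enough that per-operation processing time and FPS, as measured in the paper, become meaningful comparison metrics.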
DOI: 10.1145/3313950.3313951 · Published: 2019-02-23 (Journal Article)
Citations: 0
Three-coordinate gravimeter with exhibition of axis sensitivity based on digital videoimages
I. Korobiichuk, Yuriy Podchashinskiy, O. Bezvesilna, S. Nechay, Yuriy Shavurskiy
This paper presents a new design of a three-axis gravimeter for an aviation gravimetric system, which compensates for errors caused by the vertical accelerations of the mobile base in the measurement of the full gravity acceleration vector. The inclination angle of a mark applied to the gravimeter body, coinciding with the direction of the vertical sensitive axis, is determined by linear approximation in digital video images. These data are used to orient the sensitive axes and improve gravimeter accuracy.
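The linear-approximation step amounts to fitting a line through the mark's pixel coordinates and taking its inclination angle; a minimal least-squares sketch with synthetic coordinates:

```python
import math

def inclination_deg(points):
    """Fit y = a*x + b to the mark's pixel coordinates by least squares
    and return the fitted line's inclination angle in degrees."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return math.degrees(math.atan(slope))

# Synthetic mark pixels lying exactly on a 45-degree line.
angle = inclination_deg([(0, 0), (1, 1), (2, 2), (3, 3)])
```

Averaging over many mark pixels in this way suppresses per-pixel localization noise, which is presumably why a fitted approximation is used rather than two endpoints.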
DOI: 10.1145/3313950.3314187 · Published: 2019-02-23 (Journal Article)
Citations: 2
A fast and efficient correction technique for slant images
Wenjia Ding, Yi Xie, Yulin Wang
In many image processing applications, such as OCR and object analysis, the input image is often inclined. For subsequent processing and analysis, geometric correction is the main step in the pre-processing phase. For an image or scanned document that is rectangular or near-rectangular, if its corner(s) are missing or folded, finding the real inclination angle of the image or document is time-consuming with previous techniques. In this paper, we propose a fast and efficient algorithm to find the inclination angle of such an image or document. Experiments show that the proposed technique corrects inclined images and scanned documents very well. Compared with previous algorithms, the amount of calculation is greatly reduced, so the method is suitable for real-time correction of slant images such as scanned financial notes, vehicle license plates, and text documents.
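Once the inclination angle is found, correction reduces to rotating by the negative angle; a coordinate-level sketch (the paper's angle-finding algorithm itself is not reproduced here):

```python
import math

def rotate_point(x, y, angle_deg):
    """Rotate a point about the origin by angle_deg (counter-clockwise)."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def deskew_point(x, y, slant_deg):
    """Undo a measured slant by rotating in the opposite direction."""
    return rotate_point(x, y, -slant_deg)

# Corner of a box slanted by 45 degrees maps back onto the x-axis.
x, y = deskew_point(1.0, 1.0, 45.0)
```

In practice the same rotation is applied to every pixel (usually via an inverse mapping with interpolation), so the cost of correction is dominated by how quickly the angle itself can be found.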
DOI: 10.1145/3313950.3313971 · Published: 2019-02-23 (Journal Article)
Citations: 0
Component recognition method based on deep learning and machine vision
Haozhan Tang, Jie Chen, Xuesong Zhen
Traditional component coding recognition in the electronic component testing and screening industry relies on manual recognition or primitive machine vision technology, which suffers from low testing efficiency and a high recognition error rate. We therefore propose a novel method of component coding recognition based on machine vision combined with deep learning. A machine vision imaging system has been developed to obtain images of components, and processing operators such as grayscale conversion, mean filtering, and slant correction are used for preprocessing. The component codings of different types and materials were recognized by a deep convolutional neural network model. Extensive experiments in the component testing center and comparisons with traditional recognition demonstrate that this method has high recognition accuracy and covers a wide range of components.
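The grayscale-conversion and mean-filter preprocessing steps can be sketched as follows; the luminosity weights are the common BT.601 ones, as the paper does not state which it uses:

```python
def to_gray(rgb_img):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale
    using the common BT.601 luminosity weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_img]

def mean_filter3(img):
    """3x3 mean filter for noise suppression; borders are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + j][x + i]
                            for j in (-1, 0, 1) for i in (-1, 0, 1)) / 9
    return out
```

These cheap operators clean the captured coding image before it reaches the convolutional network, which is where the heavy lifting happens.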
DOI: 10.1145/3313950.3313962 · Published: 2019-02-23 (Journal Article)
Citations: 0
Age invariant face recognition using Frangi2D binary pattern
Sabah Afroze, M. Beham, Tamilselvi Rajendran, S. M. A. Maraikkayar, K. Rajakumar
The field of computer vision is devoted to discovering algorithms, data representations, and computer architectures that embody the principles underlying visual capabilities. Computer vision is an interdisciplinary field that deals with how computers can gain high-level understanding from digital images or videos. While very promising results have been shown on face recognition related problems, age-invariant face recognition still remains a challenge. The facial appearance of a human varies over time, which results in substantial intra-class variations. To address this problem, we propose the Frangi2D method for normalization, the Local Binary Pattern (LBP) for feature extraction, and a Sparse Representation Classifier (SRC). Extensive results are reported on a well-known public-domain face aging dataset, MORPH. The experimental results show the superiority of our proposed method in age-invariant face recognition.
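A minimal sketch of the basic 3×3 Local Binary Pattern code for a single pixel (the Frangi2D normalization the paper combines it with is not reproduced here):

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP: threshold each neighbour of a 3x3 patch
    against the centre pixel and pack the bits clockwise from top-left."""
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left corner.
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code

code = lbp_code([[6, 5, 2],
                 [7, 6, 1],
                 [9, 8, 7]])
```

Histograms of these per-pixel codes over image regions form the texture descriptor that is then fed to the sparse representation classifier.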
DOI: 10.1145/3313950.3313961 · Published: 2019-02-23 (Journal Article)
Citations: 4