
Anais do XVII Workshop de Visão Computacional (WVC 2021): Latest Publications

Evaluation of normalization technique on classification with deep learning features
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18898
A. D. Freitas, Adriano B. Silva, A. S. Martins, L. A. Neves, T. A. A. Tosta, P. D. Faria, M. Z. Nascimento
Cancer is one of the diseases with the highest mortality rate in the world. Dysplasia is a difficult-to-diagnose precancerous lesion that may not have a good Hematoxylin and Eosin (H&E) stain ratio, making diagnosis by the histology specialist difficult. In this work, a method for normalizing H&E stains in histological images was investigated. The method uses a generative neural network based on a U-net for image generation and a PatchGAN architecture for discrimination. The normalized histological images were then fed to classification algorithms to investigate detection of the level of dysplasia present in histological tissue of the oral cavity. CNN models as well as hybrid models that combine deep-learned features with machine learning algorithms were evaluated. The combination of the ResNet-50 architecture and the Random Forest algorithm achieved an accuracy of around 97% on images normalized with the investigated method.
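The hybrid pipeline described (deep-learned features classified by a Random Forest) can be sketched as below. The feature matrix is synthetic, standing in for pooled ResNet-50 activations of normalized H&E images, and the labels are a hypothetical dysplasia/no-dysplasia split; the paper's 97% figure is not reproduced here.

```python
# Hybrid classification sketch: deep-learned features fed to a Random Forest.
# X stands in for pooled ResNet-50 activations; y is a synthetic label.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 64))                 # 200 images, 64-dim features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic binary label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(f"test accuracy: {accuracy:.2f}")
```

In practice the only change needed is to replace `X` with the activation vectors extracted from the real normalized images.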
Citations: 0
Grocery Product Recognition to Aid Visually Impaired People
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18896
André Machado, K. Aires, R. Veras, L. B. Britto Neto
This paper proposes a new approach to object recognition to assist visually impaired people. The approach achieved accuracy rates higher than those reported by the authors of the selected datasets. We applied data augmentation together with other techniques and adjustments to different pre-trained CNNs (Convolutional Neural Networks). The ResNet-50 based approach achieved the best results on the most recent datasets. This work focused on products usually found on grocery store shelves and in supermarkets, refrigerators, or pantries.
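Data augmentation of the kind mentioned can be sketched with label-preserving transforms; the flip-and-crop pair below is an assumed, minimal recipe, not the paper's exact configuration.

```python
# Minimal data-augmentation sketch: random horizontal flip plus random crop,
# the kind of label-preserving transforms typically paired with pre-trained CNNs.
import numpy as np

def augment(image, rng, crop=24):
    """Return a randomly flipped and cropped copy of an HxWxC image."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]          # horizontal flip
    h, w, _ = image.shape
    top = rng.integers(0, h - crop + 1)    # random crop origin
    left = rng.integers(0, w - crop + 1)
    return image[top:top + crop, left:left + crop, :]

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
aug = augment(img, rng)
print(aug.shape)  # (24, 24, 3)
```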
Citations: 1
Periocular authentication in smartphones applying uLBP descriptor on CNN Feature Maps
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18890
William Barcellos, A. Gonzaga
The outputs of CNN layers, called Activations, are composed of Feature Maps, which carry textural information that can be extracted by a texture descriptor. Standard CNN feature extraction uses Activations as feature vectors for object recognition. The goal of this work is to evaluate a new methodology for CNN feature extraction. In this paper, instead of using the Activations as a feature vector, we use a CNN as a feature extractor and then apply a texture descriptor directly on the Feature Maps. The features extracted by the texture descriptor are then used as a feature vector for authentication. To evaluate the proposed method, we use the AlexNet CNN previously trained on the ImageNet database as a feature extractor; we then apply the uniform LBP (uLBP) descriptor on the Feature Maps for texture extraction. We tested the proposed method on the VISOB dataset, composed of periocular images taken with 3 different smartphones under 3 different lighting conditions. Our results show that using a texture descriptor on CNN Feature Maps achieves better performance than handcrafted computer vision methods or even standard CNN feature extraction.
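Applying the uLBP to a single feature map can be sketched as below, assuming the common 8-neighbor, radius-1 formulation with a 59-bin histogram (58 uniform patterns plus one bin for everything else); the paper's exact LBP parameters are not restated here.

```python
# Uniform LBP histogram of one 2D feature map (8 neighbors, radius 1).
import numpy as np

def ulbp_histogram(fmap):
    """Return a 59-bin, L1-normalized uniform-LBP histogram of fmap."""
    h, w = fmap.shape
    c = fmap[1:-1, 1:-1]                           # interior (center) pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.int64)
    for i, (dy, dx) in enumerate(offsets):
        neighbor = fmap[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= c).astype(np.int64) << i

    def transitions(p):                            # circular 0/1 transitions
        bits = [(p >> i) & 1 for i in range(8)]
        return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

    uniform = [p for p in range(256) if transitions(p) <= 2]  # 58 patterns
    lut = np.full(256, len(uniform), dtype=np.int64)          # rest -> bin 58
    for idx, p in enumerate(uniform):
        lut[p] = idx
    hist = np.bincount(lut[codes].ravel(), minlength=len(uniform) + 1)
    return hist / hist.sum()

fm = np.random.default_rng(1).random((16, 16))     # stand-in feature map
descriptor = ulbp_histogram(fm)
print(descriptor.shape)  # (59,)
```

Concatenating such histograms over all feature maps of a layer yields the authentication feature vector the paper describes.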
Citations: 0
A comparative study of convolutional neural networks for classification of pigmented skin lesions
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18909
Natalia Camillo do Carmo, J. F. Mari
Skin cancer is one of the most common types of cancer in Brazil, and its incidence rate has increased in recent years. Melanoma cases are more aggressive than nonmelanoma skin cancer. Machine learning-based classification algorithms can help dermatologists diagnose whether a skin lesion is melanoma or non-melanoma cancer. We compared four convolutional neural network architectures (ResNet-50, VGG16, Inception-v3, and DenseNet-121) using different training strategies and validation methods to classify seven classes of skin lesions. The experiments used the HAM10000 dataset, which contains 10,015 images of pigmented skin lesions. We considered test accuracy to determine the best model for each strategy. DenseNet-121 was the best model when trained with fine-tuning and data augmentation, reaching 90% accuracy under k-fold cross-validation. Our results can help improve the use of machine learning algorithms for classifying pigmented skin lesions.
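The k-fold cross-validation protocol mentioned is essentially index bookkeeping; the sketch below is a generic split over the HAM10000 sample count, not the authors' exact folds.

```python
# Generic k-fold split: every sample appears in exactly one validation fold.
import numpy as np

def kfold_indices(n, k, seed=0):
    """Yield (train_idx, val_idx) pairs covering all n samples exactly once."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

# HAM10000 has 10,015 images; 5 folds give 2,003 validation images each
splits = list(kfold_indices(10015, 5))
print(len(splits), len(splits[0][1]))  # 5 2003
```

The reported cross-validated accuracy is then the mean of the per-fold test accuracies.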
Citations: 1
Neonatal Face Mosaic: An areas-of-interest segmentation method based on 2D face images
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18914
Pedro Henrique Silva Domingues, Renan Martins Mendes da Silva, Ibrahim Jamil Orra, Matheus Elias Cruz, T. Heiderich, C. Thomaz
The daily life of preterm babies may involve long exposure to pain, causing problems in the development of the nervous system. In this context, an ongoing area of research is the development of image-based automatic pain detection systems built on several techniques, from anatomical measurements to artificial intelligence. These systems generally face two main issues: categorizing the facial regions most relevant for identifying neonatal pain, and the practical difficulty posed by artifacts obstructing parts of the face. This paper proposes and implements an areas-of-interest automatic segmentation method that allows the creation of a novel dataset containing crops of neonatal faces relevant for pain classification, labelled by area-of-interest and pain status. Moreover, we have also investigated the use of similarity matching techniques to compare each area-of-interest to the corresponding one extracted from a prototype face with no occlusion.
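Similarity matching between a face crop and its counterpart on an occlusion-free prototype can be sketched with normalized cross-correlation; the patch contents below are synthetic and the metric is an assumed choice, since the abstract does not fix one.

```python
# Normalized cross-correlation between two equally sized patches:
# a noisy copy of the prototype region should score higher than a random patch.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two same-shaped patches, in [-1, 1]."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

rng = np.random.default_rng(3)
proto = rng.random((20, 20))                  # area-of-interest on prototype
same = proto + 0.05 * rng.random((20, 20))    # slightly perturbed copy
other = rng.random((20, 20))                  # unrelated patch
print(ncc(proto, same) > ncc(proto, other))   # True
```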
Citations: 1
The interference of optical zoom in human and machine classification of pollen grain images
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18897
Felipe Silveira Brito Borges, Juliana Velasques Balta, Milad Roghanian, A. B. Gonçalves, Marco A. Alvarez, H. Pistori
Palynology can be applied to different areas, such as archeology and allergy studies, and its use is constantly growing. However, no publication comparing human classification with machine learning classification at different optical scales was found in the literature. An image dataset with 17 pollen species that occur in Brazil was created, and machine learning algorithms were used for automatic classification and subsequent comparison with humans. The experiments presented here show how machine and human classification behave at different optical image scales. Satisfactory results were achieved, with 98.88% average accuracy for the machine and 45.72% for human classification. The results support a single scale pattern for capturing pollen grain images, both for future computer vision experiments and for faster advances in palynology.
Citations: 0
Pavement Crack Segmentation using a U-Net based Neural Network
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18893
Raido Lacorte Galina, Thadeu Pezzin Melo, K. S. Komati
Cracks on the concrete surface are symptoms and precursors of structural degradation and hence must be identified and remedied. However, locating cracks is a time-consuming task that requires specialized professionals and special equipment. Neural networks for automatic crack detection have emerged to assist in this task. This work proposes a U-Net based neural network to perform crack segmentation, trained with the Crack500 and DeepCrack datasets separately. The U-Net used has seven contraction and seven expansion layers, differing from the original architecture's four layers in each part. The IoU obtained by the model trained with Crack500 was 71.03%, and by the model trained with DeepCrack, 86.38%.
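The IoU figures reported can be reproduced in form (not value) with a small metric sketch over boolean masks:

```python
# Intersection over Union between a predicted and a ground-truth crack mask.
import numpy as np

def iou(pred, target):
    """IoU of two boolean masks; 1.0 when both masks are empty."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return np.logical_and(pred, target).sum() / union

a = np.zeros((4, 4), bool); a[1:3, 1:3] = True   # 4 "crack" pixels
b = np.zeros((4, 4), bool); b[1:3, 1:4] = True   # 6 "crack" pixels
print(iou(a, b))  # 4/6 = 0.666...
```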
Citations: 0
HandArch: A deep learning architecture for LIBRAS hand configuration recognition
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18883
Gabriel Peixoto de Carvalho, André Luiz Brandão, F. Ferreira
Despite recent advancements in deep learning, sign language recognition persists as a challenge in computer vision due to its complexity in shape and movement patterns. Current studies that address sign language recognition treat hand pose recognition as an image classification problem. Based on this approach, we introduce HandArch, a novel architecture for real-time hand pose recognition from video, to accelerate the development of sign language recognition applications. Furthermore, we present Libras91, a novel dataset of Brazilian sign language (LIBRAS) hand configurations containing 91 classes and 108,896 samples. Experimental results show that our approach surpasses the accuracy of previous studies while working in real time on video files. The recognition accuracy of our system is 99% for the novel dataset and over 95% for other hand pose datasets.
Citations: 0
Unsupervised Segmentation of Cattle Images Using Deep Learning
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18886
Vinícius Guardieiro Sousa, A. Backes
In this work, we used the Deep Learning (DL) architecture named U-Net to segment images containing side views of cattle. We evaluated the ability of the U-Net to segment images captured against different backgrounds and of different breeds, acquired both by us and from the Internet. Since cattle images present a more constant background than other applications, we also evaluated the performance of the U-Net when changing the numbers of convolutional blocks and filters. Results show that U-Net can segment cattle images using fewer blocks and filters than the traditional U-Net, and that the number of blocks matters more than the total number of filters used.
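The trade-off between blocks and filters can be illustrated by counting convolution parameters in a U-Net-style encoder. Two 3x3 convolutions per block with filter counts doubling each block is the usual U-Net convention and is assumed here; the paper's exact configurations are not restated.

```python
# Parameter count of a U-Net-style encoder as a function of depth and width.
def conv_params(c_in, c_out, k=3):
    """Parameters of one k x k convolution (weights plus biases)."""
    return c_in * c_out * k * k + c_out

def encoder_params(blocks, base_filters, c_in=3):
    """Two 3x3 convs per block, filters doubling each block (assumed)."""
    total, c, f = 0, c_in, base_filters
    for _ in range(blocks):
        total += conv_params(c, f) + conv_params(f, f)
        c, f = f, f * 2
    return total

small = encoder_params(blocks=2, base_filters=16)   # reduced U-Net
large = encoder_params(blocks=4, base_filters=64)   # closer to the original
print(small, large, small < large)
```

Shrinking either knob cuts the parameter budget sharply, which is what makes the reduced networks in this work attractive for a near-constant background.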
Citations: 1
Application of Convolutional Neural Network in Coffee Capsule Count Aiming Collection System for Recycling
Pub Date : 2021-11-22 DOI: 10.5753/wvc.2021.18907
Henrique Wippel Parucker da Silva, G. B. Santos
Coffee capsules brought practicality and speed to the preparation of the drink. However, their popularization created a major environmental problem: the generation of a large amount of garbage, estimated at 14 thousand tons in 2021 from the capsules alone. Avoiding this disposal requires recycling the capsules, which is not a trivial job, since they are composed of various materials, and their collection presents its own challenges. A collection system is therefore of great value, one that, in addition to being automated, generates bonuses proportional to the quantity of discarded capsules. This work presents preliminary tests on the development of such a system, using a convolutional neural network for the detection of coffee capsules. The algorithm was trained with two image sets, one containing images with reflection and the other without, and achieved an accuracy of approximately 97%.
Citations: 0