
Machine Graphics and Vision: Latest Publications

Use of virtual reality to facilitate engineer training in the aerospace industry
Pub Date : 2023-11-06 DOI: 10.22630/mgv.2023.32.2.2
Andrzej Paszkiewicz, Mateusz Salach, Dawid Wydrzyński, Joanna Woźniak, Grzegorz Budzik, Marek Bolanowski, Maria Ganzha, Marcin Paprzycki, Norbert Cierpicki
This work concerns automation of the training process using modern information technologies, including virtual reality (VR). The starting point is the observation that the automotive and aerospace industries require effective methods of preparing engineering personnel. In this context, the technological process of preparing operations on a CNC machine tool was singled out. On this basis, a dedicated virtual reality environment was created that simulates the manufacturing of a selected aircraft landing gear component. For a comprehensive analysis of the pros and cons of the proposed approach, four forms of training were instantiated, involving a physical CNC machine, a physical simulator, a software simulator, and the developed VR environment. The features of each training form were analysed in terms of their potential for industrial applications. A survey using the Net Promoter Score method was also conducted among a target group of engineers regarding the potential use of each training form. As a result, the advantages and disadvantages of all four training forms were captured; they can serve as criteria for selecting the most effective training form.
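The abstract above reports a Net Promoter Score (NPS) survey across the four training forms. As a reference for how such scores are derived, here is a minimal sketch of the standard NPS formula (respondents rating 9-10 are promoters, 0-6 are detractors); the ratings below are hypothetical, not the study's data:

```python
def net_promoter_score(ratings):
    """Standard NPS: percentage of promoters (ratings 9-10) minus
    percentage of detractors (ratings 0-6), on a 0-10 scale."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

# Hypothetical ratings collected for one training form:
print(net_promoter_score([10, 9, 8, 7, 6, 9, 10, 3]))  # 4 promoters, 2 detractors of 8 -> 25.0
```

The score ranges from -100 (all detractors) to +100 (all promoters), which is what allows the four training forms to be ranked against each other.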
Citations: 0
An efficient pedestrian attributes recognition system under challenging conditions
Pub Date : 2023-08-21 DOI: 10.22630/mgv.2023.32.2.1
Ha X. Nguyen, Dong N. Hoang, Tuan A. Tran, Tuan M. Dang
In this work, an efficient pedestrian attribute recognition system is introduced. The system is based on a novel processing pipeline that combines the best-performing attribute extraction model with an efficient attribute filtering algorithm that uses keypoints of the human pose. The attribute extraction models are developed from several state-of-the-art deep networks via transfer learning, including ResNet50, Swin Transformer, and ConvNeXt. Pre-trained models of these networks are fine-tuned on the Ensemble Pedestrian Attribute Recognition (EPAR) dataset. Several optimization techniques, including the Adam optimizer with Decoupled Weight Decay Regularization (AdamW), Random Erasing (RE), and weighted loss functions, are adopted to address data imbalance and challenging conditions such as partial and occluded bodies. Experimental evaluations are performed on EPAR, which contains 26993 images of 1477 person IDs, most of them captured in challenging conditions. The results show that ConvNeXt-V2-B outperforms the other networks; its mean accuracy (mA) reaches 85.57%, and the other indices are also the highest. Adding AdamW or RE improves accuracy by 1-2%. The new loss functions mitigate data imbalance: the accuracy of under-represented attributes improves by up to 14% in the best case. Significantly, when the attribute filtering algorithm is applied, the results improve dramatically, and mA reaches an excellent value of 94.85%. Combining a state-of-the-art attribute extraction model with these optimization techniques on a large-scale, diverse dataset, together with attribute filtering, has proven a good approach and thus has high potential for practical applications.
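The abstract does not spell out the keypoint-based filtering rule, but the idea can be sketched as suppressing attribute predictions for body regions whose pose keypoints were not reliably detected. The region/attribute mappings and thresholds below are hypothetical illustrations, not the paper's actual algorithm:

```python
# Hypothetical region -> keypoint mapping (COCO-style keypoint names).
REGION_KEYPOINTS = {
    "upper": {"left_shoulder", "right_shoulder", "left_hip", "right_hip"},
    "lower": {"left_hip", "right_hip", "left_knee", "right_knee"},
    "head":  {"nose", "left_eye", "right_eye"},
}

# Hypothetical attribute -> body-region mapping.
ATTRIBUTE_REGION = {
    "short_sleeves": "upper", "long_trousers": "lower", "wearing_hat": "head",
}

def filter_attributes(predictions, keypoint_conf, min_conf=0.3, min_ratio=0.5):
    """Keep an attribute prediction only if its body region is visible:
    at least `min_ratio` of the region's keypoints were detected with
    confidence >= `min_conf`."""
    kept = {}
    for attr, score in predictions.items():
        kps = REGION_KEYPOINTS[ATTRIBUTE_REGION[attr]]
        visible = sum(1 for k in kps if keypoint_conf.get(k, 0.0) >= min_conf)
        if visible / len(kps) >= min_ratio:
            kept[attr] = score
    return kept

preds = {"short_sleeves": 0.9, "long_trousers": 0.8, "wearing_hat": 0.7}
conf = {"left_shoulder": 0.9, "right_shoulder": 0.8, "left_hip": 0.2,
        "right_hip": 0.6, "nose": 0.1}   # lower body and head barely detected
print(filter_attributes(preds, conf))    # only the upper-body attribute survives
```

Such a rule explains why filtering helps on occluded bodies: predictions for invisible regions are guesses, and discarding them removes their errors from the evaluation.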
Citations: 0
Performance evaluation of Machine Learning models to predict heart attack
Pub Date : 2023-08-14 DOI: 10.22630/mgv.2023.32.1.6
Majid Khan, Ghassan Husnain, Waqas Ahmad, Zain Shaukat, Latif Jan, Ihtisham Ul Haq, Shahab Ul Islam, Atif Ishtiaq
Coronary Artery Disease is the type of cardiovascular disease (CVD) that occurs when the blood vessels that supply blood to the heart become narrowed or blocked. As a result, the heart cannot pump enough blood to meet its needs, which leads to angina (chest pain). CVDs are the leading cause of mortality worldwide; according to the WHO, 17.9 million people died from CVDs in 2019. Machine Learning is a branch of artificial intelligence that uses algorithms to analyse large datasets efficiently. In medical research it can be used to process large amounts of data quickly, such as patient records or medical images, and it lets scientists automate the analysis of large, complex datasets to gain deeper insights into the data. Recently, researchers in the healthcare industry have been using Machine Learning techniques to assist with diagnosing heart-related diseases, so that the professionals involved in the diagnostic process can determine what is wrong with a patient and provide appropriate treatment. This paper evaluates the performance of different machine learning models. Supervised Learning algorithms, commonly used in Machine Learning, are trained on labelled data belonging to particular classes. Classification methods such as Random Forest, Decision Tree, K-Nearest Neighbour, the XGBoost algorithm, Naive Bayes, and Support Vector Machine are used to assess cardiovascular disease.
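Of the classifiers listed, K-Nearest Neighbour is simple enough to sketch from scratch: a sample is assigned the majority label of its k closest training samples. The toy features and labels below are invented for illustration and are unrelated to any real patient data:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify sample x by majority vote among its k nearest
    training samples under Euclidean distance."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Invented toy features: (age decade, cholesterol/100); label 1 = heart disease.
train_X = [(4, 1.8), (5, 2.4), (6, 3.0), (3, 1.6), (7, 3.2), (5, 2.9)]
train_y = [0, 0, 1, 0, 1, 1]
print(knn_predict(train_X, train_y, (6, 2.8)))  # nearest neighbours are mostly class 1
```

Real evaluations, like the paper's, would additionally split the data into train/test sets and compare such classifiers on held-out accuracy.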
Citations: 0
Lung and colon cancer detection from CT images using Deep Learning
Pub Date : 2023-08-10 DOI: 10.22630/mgv.2023.32.1.5
J. D. Akinyemi, Akinkunle A. Akinola, Olajumoke O. Adekunle, T. Adetiloye, E. Dansu
Cancer is a deadly disease that has become a global health concern. Lung cancer has been widely reported as the deadliest cancer type globally, with colon cancer second. Early detection is one of the primary ways to prevent lung and colon cancer fatalities. To aid early detection, we propose a computer-aided diagnostic approach that employs a Deep Learning (DL) architecture to enhance the detection of these cancer types in Computed Tomography (CT) images of suspected body parts. Our experimental dataset (LC25000) contains 25000 CT images of benign and malignant lung and colon cancer tissues. We used weights from EfficientNet, a DL architecture pre-trained for computer vision, to build and train a lung and colon cancer detection model. EfficientNet is a Convolutional Neural Network architecture that scales all input dimensions, such as depth, width, and resolution, at the same time. Our research findings showed detection accuracies of 99.63%, 99.50%, and 99.72% on the training, validation, and test sets, respectively.
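The joint scaling of depth, width, and resolution mentioned above can be made concrete. The base coefficients alpha=1.2, beta=1.1, gamma=1.15 come from the original EfficientNet paper, chosen so that alpha * beta^2 * gamma^2 is roughly 2, i.e. FLOPs approximately double with each increment of the compound coefficient phi:

```python
# EfficientNet compound scaling coefficients (from the original paper):
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # depth, width, resolution

def compound_scaling(phi):
    """Return (depth, width, resolution) multipliers for compound
    coefficient phi; all three dimensions grow together."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

for phi in range(4):
    d, w, r = compound_scaling(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```

This is what distinguishes EfficientNet from networks that scale only one dimension (e.g. deeper ResNets): accuracy and cost are traded off along all three axes at once.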
Citations: 0
Riesz-Laplace Wavelet Transform and PCNN Based Image Fusion
Pub Date : 2023-06-16 DOI: 10.22630/mgv.2023.32.1.4
Shuifa Sun, Yongheng Tang, Zhoujunshen Mei, Min Yang, Tinglong Tang, Yirong Wu
Important information perceived by human vision comes from the low-level features of an image, which can be extracted by the Riesz transform. In this study, we propose a Riesz-transform-based approach to image fusion. The image to be fused is first decomposed using the Riesz transform. The image sequence obtained in the Riesz transform domain is then subjected to the Laplacian wavelet transform, based on fractional Laplacian operators and multi-harmonic splines. After the Laplacian wavelet transform, the image representations have directional and multi-resolution characteristics. Finally, image fusion is performed, leveraging Riesz-Laplace wavelet analysis and the global coupling characteristics of a pulse-coupled neural network (PCNN). The proposed approach has been tested in several application scenarios, such as multi-focus imaging, medical imaging, remote sensing full-color imaging, and multi-spectral imaging. Compared with conventional methods, it demonstrates superior performance in visual effect, contrast, clarity, and overall efficiency.
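The first-order Riesz transform that the abstract builds on has a simple frequency-domain form: multiply the image spectrum by -i*w_x/|w| and -i*w_y/|w| to obtain the two directional components. A minimal NumPy sketch of this generic textbook construction (not the authors' code):

```python
import numpy as np

def riesz_transform(image):
    """First-order Riesz transform of a 2-D image via the FFT.
    Returns the horizontal and vertical Riesz components."""
    h, w = image.shape
    wy = np.fft.fftfreq(h)[:, None]          # vertical frequencies
    wx = np.fft.fftfreq(w)[None, :]          # horizontal frequencies
    norm = np.sqrt(wx ** 2 + wy ** 2)
    norm[0, 0] = 1.0                         # avoid division by zero at DC
    spectrum = np.fft.fft2(image)
    rx = np.real(np.fft.ifft2(-1j * wx / norm * spectrum))
    ry = np.real(np.fft.ifft2(-1j * wy / norm * spectrum))
    return rx, ry

img = np.zeros((32, 32))
img[:, 16:] = 1.0                            # vertical step edge
rx, ry = riesz_transform(img)                # rx responds to the edge; ry stays ~0
```

The strong response of `rx` to a vertical edge illustrates why the transform captures the directional low-level structure that the fusion pipeline then decomposes further.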
Citations: 0
Identifying selected diseases of leaves using deep learning and transfer learning models
Pub Date : 2023-04-06 DOI: 10.22630/mgv.2023.32.1.3
A. Mimi, Sayeda Fatema Tuj Zohura, Muhammad Ibrahim, Riddho Ridwanul Haque, Omar Farrok, T. Jabid, M. Ali
Leaf diseases may harm plants in different ways, often causing reduced productivity and, at times, lethal consequences. Detecting such diseases in a timely manner can help plant owners take effective remedial measures. Deficiencies of vital elements such as nitrogen, as well as microbial infections and similar disorders, can often have visible effects, such as the yellowing of leaves in Catharanthus roseus (bright eyes) and scorched leaves in Fragaria × ananassa (strawberry) plants. In this work, we explore computer vision techniques that help plant owners identify such leaf disorders in their plants automatically and conveniently. This research designs three machine learning systems, namely a vanilla CNN model, a CNN-SVM hybrid model, and a MobileNetV2-based transfer learning model, that detect yellowed and scorched leaves in Catharanthus roseus and strawberry plants, respectively, using images captured by mobile phones. In our experiments, the models yield very promising accuracy on a dataset of around 4000 images. Of the three models, the transfer learning-based one demonstrates the highest accuracy (97.35% on the test set). Furthermore, an Android application is developed that uses this model to allow end-users to conveniently monitor the condition of their plants in real time.
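The paper's detectors are learned, but the symptom itself (leaf yellowing) can be illustrated with a naive colour heuristic: count pixels with high red and green but low blue. The thresholds below are hypothetical and serve only as intuition, not as a substitute for the trained models:

```python
import numpy as np

def yellow_fraction(rgb, r_min=120, g_min=120, b_max=100):
    """Fraction of pixels that look yellow (high red and green, low
    blue). `rgb` is an (H, W, 3) uint8 array; thresholds are ad hoc."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mask = (r >= r_min) & (g >= g_min) & (b <= b_max)
    return mask.mean()

healthy = np.zeros((4, 4, 3), dtype=np.uint8)
healthy[..., 1] = 180                     # pure green leaf patch
yellowed = np.zeros((4, 4, 3), dtype=np.uint8)
yellowed[..., :2] = 200                   # red + green = yellow patch
print(yellow_fraction(healthy), yellow_fraction(yellowed))
```

A CNN learns far richer cues than this single colour ratio (texture, lesion shape, lighting invariance), which is why the learned models in the paper are preferred.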
Citations: 3
Exploring automated object detection methods for manholes using classical computer vision and deep learning
Pub Date : 2023-03-07 DOI: 10.22630/mgv.2023.32.1.2
S. Rao, Nitya Mitnala
Open, broken, and improperly closed manholes can pose problems for autonomous vehicles and thus need to be included in obstacle avoidance and lane-changing algorithms. In this work, we propose and compare multiple approaches for manhole localization and classification like classical computer vision, convolutional neural networks like YOLOv3 and YOLOv3-Tiny, and vision transformers like YOLOS and ViT. These are analyzed for speed, computational complexity, and accuracy in order to determine the model that can be used with autonomous vehicles. In addition, we propose a size detection pipeline using classical computer vision to determine the size of the hole in an improperly closed manhole with respect to the manhole itself. The evaluation of the data showed that convolutional neural networks are currently better for this task, but vision transformers seem promising.
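The abstract does not detail the size-detection pipeline; one natural final step is to express the hole's pixel area relative to the manhole's pixel area, which makes the measure independent of camera distance. A minimal sketch under that assumption (the areas below are invented, e.g. from contour extraction in a segmentation step):

```python
def relative_hole_size(hole_area_px, manhole_area_px):
    """Express a detected hole's area as a fraction of the manhole
    cover's area, so the measure does not depend on camera distance."""
    if manhole_area_px <= 0:
        raise ValueError("manhole area must be positive")
    return hole_area_px / manhole_area_px

# Hypothetical pixel areas from a classical contour-extraction step:
print(relative_hole_size(1500, 12000))  # 0.125
```

A threshold on this ratio could then feed the obstacle-avoidance logic: a small gap may be ignorable, while a large one must trigger a lane change.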
Citations: 0
Vision-based biomechanical markerless motion classification
Pub Date : 2023-02-16 DOI: 10.22630/mgv.2023.32.1.1
Yu Liang Liew, J. F. Chin
This study used stick-model augmentation of single-camera motion video to create a markerless motion classification model for manual operations. All videos were augmented with a stick model composed of keypoints and lines, generated by a program that incorporated the COCO keypoint set and the OpenCV and OpenPose modules to estimate the body-joint coordinates. The stick-model data included the initial velocity, cumulative velocity, and acceleration of each body joint. The extracted motion vector data were normalized using three different techniques, and the resulting datasets were evaluated with eight classifiers. The experiment involved four distinct motion sequences performed by eight participants. The random forest classifier achieved the best classification accuracy on its min-max normalized dataset: 81.80% on the dataset before random subsampling and 92.37% on the resampled dataset. The random subsampling method dramatically improved classification accuracy by removing noisy data and replacing it with replicated instances to balance the classes. This research advances methodological and applied knowledge on the capture and classification of human motion using a single camera view.
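The velocity and acceleration features described above can be derived from a keypoint trajectory with finite differences, and min-max scaling is one standard choice among the normalization techniques mentioned. A minimal sketch (the trajectory and frame rate are invented):

```python
import numpy as np

def motion_features(positions, fps=30.0):
    """Per-frame velocity and acceleration magnitudes of one keypoint
    from its (x, y) trajectory, via finite differences."""
    pos = np.asarray(positions, dtype=float)
    vel = np.diff(pos, axis=0) * fps                  # (T-1, 2)
    acc = np.diff(vel, axis=0) * fps                  # (T-2, 2)
    return np.linalg.norm(vel, axis=1), np.linalg.norm(acc, axis=1)

def min_max_normalize(x):
    """Scale a feature vector to [0, 1] (one common normalization)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span else np.zeros_like(x)

# Invented trajectory of one joint, sampled at 1 frame per time unit:
speed, accel = motion_features([(0, 0), (1, 0), (3, 0), (6, 0)], fps=1.0)
print(speed)                      # speeds 1, 2, 3
print(min_max_normalize(speed))   # scaled to 0, 0.5, 1
```

Stacking such per-joint feature vectors across all keypoints yields the kind of motion-vector dataset that the eight classifiers were trained on.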
Citations: 0
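The abstract above describes extracting per-joint velocity and acceleration from estimated keypoint coordinates and min-max normalizing the features before classification. As a rough, hypothetical sketch of such a feature pipeline (not the authors' code; the array shapes, frame rate, and function names are assumptions, and "cumulative velocity" is simplified to per-frame speed):

```python
import numpy as np

def joint_motion_features(coords, fps=30.0):
    """Given per-frame 2D joint coordinates of shape (frames, joints, 2),
    return per-joint speed and acceleration magnitudes, each of shape
    (frames - 2, joints) so the two feature sets are frame-aligned."""
    dt = 1.0 / fps
    velocity = np.diff(coords, axis=0) / dt       # (frames-1, joints, 2)
    speed = np.linalg.norm(velocity, axis=2)      # (frames-1, joints)
    accel = np.abs(np.diff(speed, axis=0)) / dt   # (frames-2, joints)
    return speed[1:], accel                       # trim to align with accel

def min_max_normalize(x):
    """Scale each feature column into [0, 1]; constant columns map to 0."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)        # avoid division by zero
    return (x - lo) / span
```

The normalized feature matrix could then be fed to any off-the-shelf classifier, such as the random forest the abstract reports as best-performing.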
Person re-identification accuracy improvement by training a CNN with the new large joint dataset and re-rank
Pub Date : 2022-12-19 DOI: 10.22630/mgv.2022.31.1.5
R. Bohush, S. Ihnatsyeva, S. Ablameyko
This paper aims to improve person re-identification accuracy in distributed video surveillance systems by constructing a large joint image dataset of people for training convolutional neural networks (CNNs). To this end, an analysis of existing datasets is provided. Then, a new large joint dataset for the person re-identification task is constructed, which includes the existing public datasets CUHK02, CUHK03, Market, Duke, MSMT17 and PolReID. Re-identification testing is performed with the frequently cited CNNs ResNet-50, DenseNet121 and PCB. Re-identification accuracy is evaluated with the main metrics Rank, mAP and mINP. The new large joint dataset improves Rank-1, mAP and mINP on all test sets, and re-ranking is used to further increase re-identification accuracy. The presented results confirm the effectiveness of the proposed approach.
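Rank-1 and mAP, two of the metrics named above, are computed from a query-gallery distance matrix. A minimal sketch of that computation (an illustration of the standard metrics, not the paper's evaluation code; mINP is omitted):

```python
import numpy as np

def rank1_and_map(dist, q_ids, g_ids):
    """Compute Rank-1 accuracy and mAP from a (num_query, num_gallery)
    distance matrix and query/gallery identity labels."""
    rank1_hits, aps = [], []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])            # gallery sorted by distance
        matches = g_ids[order] == q_ids[i]     # boolean relevance vector
        rank1_hits.append(bool(matches[0]))    # nearest neighbour correct?
        hit_pos = np.flatnonzero(matches)
        if hit_pos.size == 0:                  # query identity not in gallery
            continue
        # average precision: precision at each true-match position
        precisions = (np.arange(hit_pos.size) + 1) / (hit_pos + 1)
        aps.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(aps))
```

Re-ranking methods typically rewrite the distance matrix (e.g. with k-reciprocal neighbours) before this evaluation step, which is how they can raise all three metrics at once.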
{"title":"Person re-identification accuracy improvement by training a CNN with the new large joint dataset and re-rank","authors":"R. Bohush, S. Ihnatsyeva, S. Ablameyko","doi":"10.22630/mgv.2022.31.1.5","DOIUrl":"https://doi.org/10.22630/mgv.2022.31.1.5","url":null,"abstract":"The paper is aimed to improve person re-identification accuracy in distributed video surveillance systems based on constructing a large joint image dataset of people for training convolutional neural networks (CNN). For this aim, an analysis of existing datasets is provided. Then, a new large joint dataset for person re-identification task is constructed that includes the existing public datasets CUHK02, CUHK03, Market, Duke, MSMT17 and PolReID. Testing for re-identification is performed for such frequently cited CNNs as ResNet-50, DenseNet121 and PCB. Re-identification accuracy is evaluated by using the main metrics Rank, mAP and mINP. The use of the new large joint dataset makes it possible to improve Rank1 mAP, mINP on all test sets. Re-ranking is used to further increase the re-identification accuracy. Presented results confirm the effectiveness of the proposed approach.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76316851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Attention-based U-Net for image demoiréing
Pub Date : 2022-12-15 DOI: 10.22630/mgv.2022.31.1.1
Tomasz Lehmann
Image demoiréing is a particular instance of the image restoration problem. Moiré is an interference pattern generated by overlaying similar but slightly offset templates. In this paper, we present a deep-learning-based algorithm to reduce moiré disruptions. The proposed solution includes an explanation of the cross-sampling procedure, a training-dataset management method optimized for limited computing resources. The suggested neural network architecture is based on the Attention U-Net structure, an exceptionally effective model that had not previously been applied to image demoiréing. Its greatest improvement over the U-Net network is the implementation of attention gates; these additional computing operations make the algorithm focus on target structures. We also examined three MSE- and SSIM-based loss functions. The SSIM index is used to predict the perceived quality of digital images and videos, and a similar approach has been applied in various computer-vision areas. The author's main contributions to the image demoiréing problem include the use of a novel architecture for this task, an innovative two-part loss function, and the atypical use of the cross-sampling training procedure.
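The SSIM index mentioned in the abstract can be sketched in its single-window form. Practical SSIM losses average the index over sliding Gaussian windows, so the following is an illustrative simplification, not the author's implementation:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over two equally shaped float images in
    [0, data_range]. The constants c1, c2 stabilise the division."""
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_loss(pred, target):
    """SSIM equals 1 for identical images, so 1 - SSIM is minimised;
    it can be mixed with MSE to form a two-part loss."""
    return 1.0 - ssim_global(pred, target)
```

Because SSIM is bounded by 1 and differentiable, 1 - SSIM slots directly into gradient-based training alongside an MSE term.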
{"title":"Attention-based U-Net for image demoiréing","authors":"Tomasz Lehmann","doi":"10.22630/mgv.2022.31.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2022.31.1.1","url":null,"abstract":"Image demoiréing is a particular example of a picture restoration problem. Moiré is an interference pattern generated by overlaying similar but slightly offset templates.In this paper, we present a deep learning based algorithm to reduce moiré disruptions. The proposed solution contains an explanation of the cross-sampling procedure – the training dataset management method which was optimized according to limited computing resources.Suggested neural network architecture is based on Attention U-Net structure. It is an exceptionally effective model which was not proposed before in image demoiréing systems. The greatest improvement of this model in comparison to U-Net network is the implementation of attention gates. These additional computing operations make the algorithm more focused on target structures.We also examined three MSE and SSIM based loss functions. The SSIM index is used to predict the perceived quality of digital images and videos. 
A similar approach was applied in various computer vision areas.The author’s main contributions to the image demoiréing problem contain the use of the novel architecture for this task, innovative two-part loss function, and the untypical use of the cross-sampling training procedure.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82486740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0