
Proceedings of the 4th International Conference on Image Processing and Machine Vision: Latest Publications

Masked Face Recognition with 3D Facial Geometric Attributes
Yuan Wang, Zhen Yang, Zhiqiang Zhang, Huaijuan Zang, Qiang Zhu, Shu Zhan
During the coronavirus pandemic, the demand for contactless biometric technology has promoted the development of masked face recognition. Training a masked face recognition model must address two crucial issues: the lack of large-scale, realistic masked face datasets, and the difficulty of obtaining robust face representations given the large difference between complete and masked faces. To tackle the first issue, this paper proposes training a 3D masked face recognition network with non-masked face images. For the second issue, this paper represents the face with the geometric attributes of the 3D face, namely depth, azimuth, and elevation. The inherent advantages of 3D face data enhance the stability and practicability of the 3D masked face recognition network. In addition, a facial geometry extractor is proposed to highlight discriminative facial geometric features, so that the network can take full advantage of the depth, azimuth, and elevation information when distinguishing face identities. Experimental results on four public 3D face datasets show that the proposed network improves the accuracy of masked face recognition, which verifies the feasibility of training a masked face recognition model with non-masked face images.
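The three geometric attributes named in the abstract (depth, azimuth, elevation) can be derived from a 3D face scan once per-pixel surface normals are available. A minimal sketch of that derivation, assuming unit normals are already estimated; the paper's own facial geometry extractor is not reproduced here:

```python
import numpy as np

def geometric_attribute_maps(depth, normals):
    """Turn a depth map (H, W) and per-pixel unit surface normals
    (H, W, 3) into the three geometric attribute maps.
    Azimuth is the in-plane angle of the normal projection;
    elevation is the normal's angle out of the image plane."""
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    azimuth = np.arctan2(ny, nx)                   # range [-pi, pi]
    elevation = np.arcsin(np.clip(nz, -1.0, 1.0))  # range [-pi/2, pi/2]
    return depth, azimuth, elevation
```

The three maps can then be stacked as a three-channel input in place of an RGB image.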
DOI: 10.1145/3529446.3529449 · Published: 2022-03-25
Citations: 1
Development and future of information hiding in image transformation domain: A literature review
Yue Yang
Information hiding is a technique for concealing meaningful information in public carrier information. As data elements become more and more important, information hiding performs better than traditional encryption and decryption techniques such as single-table substitution and the Vigenère cipher. Because it contains redundant information, the image is a common carrier for information hiding among the many carrier types. As one of the primary means of information hiding, transform-domain image information hiding has been widely studied and used in academic and industrial fields. Hiding information in the image transform domain effectively improves security and robustness. This paper introduces the concepts, principles, processes, mainstream algorithms, and applications of transform-domain image information hiding techniques, and sketches a possible general direction for its future development.
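To make the transform-domain idea concrete, here is a minimal, hypothetical sketch of hiding one bit in a mid-frequency DCT coefficient of an 8x8 block by parity quantization. The survey covers many such schemes; this is an illustration, not any specific algorithm from it:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix.
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def embed_bit(block, bit, q=16.0):
    """Hide one bit by quantizing a mid-frequency DCT coefficient
    to an even (bit 0) or odd (bit 1) multiple of the step q."""
    c = dct_matrix(block.shape[0])
    d = c @ block @ c.T          # forward 2D DCT
    level = np.round(d[3, 4] / q)
    if int(level) % 2 != bit:
        level += 1
    d[3, 4] = level * q
    return c.T @ d @ c           # inverse 2D DCT back to pixels

def extract_bit(block, q=16.0):
    c = dct_matrix(block.shape[0])
    d = c @ block @ c.T
    return int(np.round(d[3, 4] / q)) % 2
```

Working on a quantized coefficient rather than raw pixels is what gives transform-domain schemes their robustness to mild pixel-level distortion.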
DOI: 10.1145/3529446.3529458 · Published: 2022-03-25
Citations: 0
An Improved Dark Channel Prior Defogging Algorithm Based on Transmissivity Image Segmentation
Wenjing Yu, Jinyu He, Jing Yin, Enqi Chen
To address the shortcomings of the dark channel prior algorithm on hazy images, namely color distortion in the sky region, erroneous extraction of the atmospheric light value, and halo effects at scene edges, this paper proposes a dark channel prior defogging method based on the transmittance image. The input hazy image is converted into a transmittance image. With guided filtering, the improved MSR algorithm segments the image into a sky region and a non-sky region. Minimum filtering and sky transmittance estimation are performed for the sky region and the non-sky region, respectively. The two processed parts of the image are combined, the transmission is refined by fast guided filtering, and the haze is removed using the atmospheric light value extracted from the sky region, yielding a clear restored image. Experimental results show that the improved minimum filtering algorithm and transmittance estimation method effectively remove the halo effect at depth-of-field edges and the color distortion in the sky area, so the restored image retains more detail and looks clearer and more natural. Compared with the traditional dark channel prior algorithm, the proposed algorithm increases information entropy by 12.1% on average, PSNR by 6.024% on average, and SSIM by 15.8%, while decreasing MSE by 4.7% on average.
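The two quantities at the start of this pipeline, the dark channel and the transmittance estimate, follow the classic dark channel prior formulation t(x) = 1 - omega * min_patch(min_c I_c / A_c). A minimal sketch of just those steps; the paper's improved MSR segmentation and guided-filtering refinement are omitted:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel minimum over the color channels, followed by a
    minimum filter over a patch x patch neighbourhood."""
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode='edge')
    h, w = mins.shape
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def estimate_transmission(img, airlight, omega=0.95, patch=15):
    # t(x) = 1 - omega * dark_channel(I / A); omega keeps a trace
    # of haze so distant objects still look natural.
    normalized = img / airlight[None, None, :]
    return 1.0 - omega * dark_channel(normalized, patch)
```

A haze-free region has a near-zero dark channel and hence transmission near 1, while dense haze pushes the estimate toward 1 - omega.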
DOI: 10.1145/3529446.3529454 · Published: 2022-03-25
Citations: 1
Image Processing based Scoring System for Small Arms Firing in the Military Domain
Md Rezoanul Hafiz Chandan, Tanvir Ahamad Naim, Md. Abdur Razzak, N. Sharmin
Firing, and the overall conduct of its scoring, plays a vital role in military organizations. Currently, the evaluation of such a system is only partially automated and still requires manual work. A fully automatic calculation of target scores is in high demand and can add great value by minimizing human error. This paper focuses on designing and developing a process for automatically scoring a fired target based on image processing techniques in the military domain. Along with conventional image processing techniques, this work uses structural similarity indexing algorithms to detect bullet holes in images. The experiment was conducted on real-time scenarios from the military field and shows promising results for calculating scores automatically.
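Structural similarity indexing compares luminance, contrast, and structure between a reference target image and a post-firing image, so bullet holes show up as regions where similarity drops. A single-window simplification of the SSIM formula, assuming 8-bit grayscale inputs (the paper's exact pipeline is not reproduced):

```python
import numpy as np

def ssim(x, y, c1=6.5025, c2=58.5225):
    """Global SSIM between two grayscale images; a one-window
    simplification of the usual 11x11 sliding version. Default
    constants correspond to an 8-bit dynamic range (L = 255)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Tiling both images and thresholding the per-tile SSIM would localize the new holes for scoring.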
DOI: 10.1145/3529446.3529456 · Published: 2022-03-25
Citations: 2
Research on the Application of Three-dimensional Digital Model in the Protection and Inheritance of Traditional Architecture: Take the example of the Ma Tau Wall of Huizhou architecture
Aozhi Wei, Yang Cao, Wangqiao Rong
As a representative element of Huizhou architecture, the Ma Tau Wall carries a deep historical and cultural heritage and plays an important role in the study of Huizhou culture and in the innovation and development of modern architecture. Taking the Huizhou Ma Tau Wall as an example and combining it with 3D animation, this paper explores how traditional architecture can be preserved and passed on in digital form with the support of 3D technology. Based on fieldwork and a literature review, the Ma Tau Wall is digitally restored in Maya and given interactive effects with the help of UE4, and is finally made into an interactive model, so that viewers can experience traditional ancient architectural culture in a novel form, which is more conducive to the protection and inheritance of ancient architecture.
DOI: 10.1145/3529446.3529452 · Published: 2022-03-25
Citations: 2
Decision Modeling and Simulation of Fighter Air-to-ground Combat Based on Reinforcement Learning
Yifei Wu, Yonglin Lei, Zhi Zhu, Yan Wang
With Artificial Intelligence (AI) widely used in air combat simulation systems, the decision-making system of a fighter has reached a high level of complexity. Pure theoretical analysis and rule-based systems are no longer sufficient to represent the cognitive behavior of pilots. To properly specify the autonomous decision-making of a fighter, we therefore propose a unified framework that combines combat simulation and machine learning. The framework adopts deep reinforcement learning, using supervised learning and Deep Q-Network (DQN) methods. As a proof of concept, we built an autonomous decision-making training scenario based on the Weapon Effectiveness Simulation System (WESS). Simulation results show that the intelligent decision-making model based on the proposed framework achieves better combat effects than the traditional decision-making model based on knowledge engineering.
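The DQN update at the core of such a framework bootstraps Q(s, a) toward r + gamma * max_a' Q(s', a'). As an illustration only, here is that same update in its tabular Q-learning form on a toy list of transitions; the paper's network, state space, and WESS environment are not modeled:

```python
import numpy as np

def q_learning(transitions, n_states, n_actions,
               alpha=0.1, gamma=0.9, episodes=200):
    """Tabular Q-learning on a fixed list of (s, a, r, s_next, done)
    transitions -- a toy stand-in for the DQN update rule
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        for s, a, r, s_next, done in transitions:
            target = r if done else r + gamma * q[s_next].max()
            q[s, a] += alpha * (target - q[s, a])
    return q
```

DQN replaces the table with a neural network and samples transitions from a replay buffer, but the target being regressed toward is the same.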
DOI: 10.1145/3529446.3529463 · Published: 2022-03-25
Citations: 0
Event Camera Survey and Extension Application to Semantic Segmentation
Siqi Jia
Event cameras are a radically novel kind of vision sensor. Unlike traditional standard cameras, which acquire full images at a fixed rate, event cameras capture brightness changes for each pixel asynchronously. The output of an event camera is therefore a stream of events, each carrying the time, location, and sign of a pixel's brightness change. Event cameras have many advantages over traditional cameras: high temporal resolution (microsecond resolution), low latency, low power (10 mW), and high dynamic range (HDR > 120 dB). They are therefore increasingly used in fields where frame-based cameras fall short, such as AR/VR, video games, mobile robotics, and computer vision. In this paper, we first describe the basic principle and advantageous properties of the event camera. We then introduce its wide range of applications: tracking, high-speed and high-dynamic-range video reconstruction, dynamic obstacle detection and avoidance, and motion segmentation. On top of these fundamental applications, more intelligent and even fully vision-based applications arise, such as the combination with fully convolutional networks. Finally, we use DeepLab to perform semantic segmentation of the scene and apply the result to the corresponding points of the 3D reconstruction of the same scene, and we propose a potential solution to the ambiguity problem of semantic segmentation.
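The event stream described above, tuples of timestamp, pixel location, and polarity, is commonly accumulated into a 2D frame before frame-based networks such as DeepLab can consume it. A minimal sketch of that conversion (the tuple layout here is an assumption, not a specific camera's format):

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate a stream of (t, x, y, polarity) events into a
    signed 2D frame: +1 for each brightness-increase event, -1 for
    each decrease. A common first step before frame-based networks."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, polarity in sorted(events):  # sort by timestamp
        frame[y, x] += 1 if polarity > 0 else -1
    return frame
```

Windowing the stream by time or by event count before accumulation trades temporal resolution against frame density.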
DOI: 10.1145/3529446.3529465 · Published: 2022-03-25
Citations: 3
AAT: An Efficient Adaptive Adversarial Training Algorithm
Menghua Cao, Dongxia Wang, Yulong Wang
Adversarial training is one of the most promising methods for improving a model's robustness, but its expensive training cost remains a major problem. Recent work has made great efforts to improve its performance by reducing the cost of constructing the inner adversarial samples. These works alleviate the problem to some extent, but the overall cost is still high and the methods are not interpretable. In this work, we propose the AAT (Adaptive Adversarial Training) algorithm, which exploits the inherent relationship between a model's robustness and the effect of the adversarial samples to accelerate training. Our method offers more interpretable robustness improvement while achieving higher efficiency than state-of-the-art methods on standard datasets, reducing training time by more than 56% compared with traditional adversarial training on CIFAR10.
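The inner adversarial sample construction is the expensive step that adaptive methods try to cheapen. As a generic illustration of that inner/outer loop (not the AAT algorithm itself), here is a single-step FGSM attack wrapped in the training loop of a toy logistic-regression model:

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fgsm_adversarial_train(x, y, epsilon=0.1, lr=0.5, epochs=100):
    """Adversarial training with a one-step FGSM inner attack:
    perturb each input by epsilon * sign(grad_x loss), then take a
    gradient step on the perturbed batch. A toy sketch only."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = _sigmoid(x @ w + b)
        grad_x = (p - y)[:, None] * w[None, :]    # d loss / d input
        x_adv = x + epsilon * np.sign(grad_x)     # FGSM inner step
        p_adv = _sigmoid(x_adv @ w + b)
        w -= lr * x_adv.T @ (p_adv - y) / len(y)  # outer model update
        b -= lr * (p_adv - y).mean()
    return w, b
```

Multi-step attacks such as PGD repeat the inner perturbation several times per batch, which is exactly where the training-cost blowup comes from.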
DOI: 10.1145/3529446.3529464 · Published: 2022-03-25
Citations: 0
Hierarchical Iris Image Super Resolution based on Wavelet Transform
Yufeng Xia, Peipei Li, Jia Wang, Zhili Zhang, Duanling Li, Zhaofeng He
Iris images captured in surveillance scenarios are often low-quality, which makes iris recognition challenging. Recently, deep learning-based methods have been adopted to enhance the quality of iris images and achieve remarkable performance. However, these methods ignore the characteristics of the iris texture, which are important for iris recognition. To restore richer texture details, we propose a super-resolution network based on Wavelet with Transformer and Residual Attention Network (WTRAN). Specifically, we treat the low-resolution image as the low-frequency wavelet coefficients after wavelet decomposition and predict the corresponding high-frequency wavelet coefficient sequences. To extract detailed local features, we adopt both channel and spatial attention and propose a Residual Dense Attention Block (RDAB). Furthermore, we propose a Convolutional Transformer Attention Module (CTAM) that integrates a transformer with a CNN to extract both the global topology and local texture details. In addition to constraining the quality of image generation, effective identity-preserving constraints are used to ensure the consistency of the super-resolved images in the high-level semantic space. Extensive experiments show that the proposed method achieves competitive iris image super-resolution results compared with the most advanced super-resolution methods.
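The wavelet framing used here treats the low-resolution input as the low-frequency (LL) band of a wavelet decomposition and asks the network to predict the missing detail bands. A one-level Haar transform makes that decomposition concrete; WTRAN's actual networks are not reproduced:

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar wavelet transform: returns the
    low-frequency approximation LL plus the LH, HL, HH detail bands.
    In the paper's setting, a low-resolution image plays the role of
    LL and the network predicts the three detail bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # rows: average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # rows: difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    # Exact inverse: undo the column step, then the row step.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    out = np.empty((a.shape[0] * 2, a.shape[1]))
    out[0::2, :], out[1::2, :] = a + d, a - d
    return out
```

Because the inverse is exact, predicting the three detail bands and applying `haar_idwt2` yields a double-resolution image directly.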
DOI: 10.1145/3529446.3529453 · Published: 2022-03-25
Citations: 0
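The premise of the WTRAN approach above — treating the low-resolution image as the low-frequency (LL) band of a wavelet decomposition and predicting the missing high-frequency bands — rests on the standard 2D discrete wavelet transform. Below is a minimal numpy-only sketch of a one-level 2D Haar transform and its inverse; the Haar basis and the function names are illustrative assumptions, since the paper does not specify its wavelet basis and its actual pipeline is a learned network, not a hand-coded transform:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar decomposition into LL, LH, HL, HH sub-bands.
    Expects a 2D array with even height and width."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # low-frequency approximation (a scaled downsample)
    lh = (a - b + c - d) / 2.0   # detail along one orientation
    hl = (a + b - c - d) / 2.0   # detail along the other orientation
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: reassemble the full-resolution image."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img
```

The transform is perfectly invertible, which is what makes the formulation attractive for super resolution: if a network can predict plausible LH/HL/HH bands from the LL band (the low-resolution input, up to a scale factor), the inverse transform yields the high-resolution image directly.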
Vision-based Assessment of Instructional Content on Golf Performance 基于视觉的高尔夫成绩教学内容评价
Akshay Krishna, Patrick Smith, M. Nicolescu, S. Hayes
This paper focuses on the detection and tracking of a golf ball from video input, as a part of a study aiming to evaluate the influence of instructional content delivered through Acceptance and Commitment Training (ACT) on sports performance. As part of this experiment, the participants were putting on a small green region, and they recorded themselves performing the swing. Using our generated dataset, we propose an automated solution that involves only vision processing to detect and track the golf ball, and to estimate metrics such as velocity, shot angle, and whether the golf ball made the hole. We use color detection, background subtraction, contour detection, homography transformations, and other techniques to detect and track the golf ball. Our approach was extensively tested on a large dataset with annotations, with good results.
{"title":"Vision-based Assessment of Instructional Content on Golf Performance","authors":"Akshay Krishna, Patrick Smith, M. Nicolescu, S. Hayes","doi":"10.1145/3529446.3529459","DOIUrl":"https://doi.org/10.1145/3529446.3529459","url":null,"abstract":"This paper focuses on the detection and tracking of a golf ball from video input, as a part of a study aiming to evaluate the influence of instructional content delivered through Acceptance and Commitment Training (ACT) on sports performance. As part of this experiment, the participants were putting on a small green region, and they recorded themselves performing the swing. Using our generated dataset, we propose an automated solution that involves only vision processing to detect and track the golf ball, and to estimate metrics such as velocity, shot angle, and whether the golf ball made the hole. We use color detection, background subtraction, contour detection, homography transformations, and other techniques to detect and track the golf ball. Our approach was extensively tested on a large dataset with annotations, with good results.","PeriodicalId":151062,"journal":{"name":"Proceedings of the 4th International Conference on Image Processing and Machine Vision","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127193148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
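The pipeline described above — subtracting a background model, taking the foreground centroid as the ball position per frame, and deriving velocity from frame-to-frame displacement — can be sketched with numpy alone. This is a toy illustration, not the authors' implementation: the paper uses additional OpenCV steps (color detection, contour detection, homography), and the fixed pixels-per-meter scale below stands in for their homography-based ground-plane mapping. All function names and parameters are assumptions:

```python
import numpy as np

def track_ball(frames, threshold=30.0):
    """Estimate per-frame ball positions from a stack of grayscale frames.
    The background is modeled as the per-pixel median over all frames;
    pixels differing from it by more than `threshold` are foreground."""
    background = np.median(frames, axis=0)
    positions = []
    for frame in frames:
        fg = np.abs(frame - background) > threshold
        ys, xs = np.nonzero(fg)
        if len(xs) == 0:
            positions.append(None)               # ball not visible this frame
        else:
            positions.append((xs.mean(), ys.mean()))  # foreground centroid (x, y)
    return positions

def estimate_velocity(positions, fps, px_per_meter):
    """Average speed in m/s from consecutive detected centroids."""
    pts = [p for p in positions if p is not None]
    if len(pts) < 2:
        return 0.0
    dists = [np.hypot(x2 - x1, y2 - y1)
             for (x1, y1), (x2, y2) in zip(pts, pts[1:])]
    return float(np.mean(dists)) * fps / px_per_meter
```

With synthetic frames of a bright 2x2 "ball" moving 3 pixels per frame at 10 fps and a scale of 100 px/m, this yields 0.3 m/s; in the real pipeline the scale would instead come from the homography between the camera view and the putting green.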