
Latest articles from Journal of Electronic Imaging

Reconstructing images with attention generative adversarial network against adversarial attacks
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033029
Xiong Shen, Yiqin Lu, Zhe Cheng, Zhongshu Mao, Zhang Yang, Jiancheng Qin
Deep learning is widely used in computer vision, but the emergence of adversarial examples threatens its application. How to effectively detect adversarial examples and correct their labels has become a problem to be solved in this field. Generative adversarial networks (GANs) can effectively learn features from images. Building on GANs, this work proposes a defense method called "Reconstructing images with GAN" (RIG). Adversarial examples produced by attack algorithms are reconstructed by the trained generator of RIG, which removes the perturbations that disturb the classification models, so that the models recover the correct labels when classifying the reconstructed images. In addition, to improve the defensive performance of RIG, an attention mechanism (AM) is introduced, yielding a variant called reconstructing images with attention GAN (RIAG). Experiments show that RIG and RIAG effectively eliminate the perturbations of adversarial examples, and that RIAG has a better defensive performance than RIG, indicating that introducing AM effectively improves the defense effect of RIG.
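The reconstruct-then-classify pipeline the abstract describes can be sketched minimally as follows. This is an illustrative stand-in, not the authors' RIG/RIAG: the linear "classifier," the random perturbation, and the moving-average "generator" are all assumptions chosen so the flow of data matches the described defense.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(10, 64))  # hypothetical linear classifier weights

def classify(x):
    """Return the predicted label of a 64-dim feature vector."""
    return int(np.argmax(W @ x))

def reconstruct(x_adv, width=5):
    """Placeholder for the trained GAN generator: suppress high-frequency
    perturbation energy with a moving-average filter (illustrative only)."""
    kernel = np.ones(width) / width
    return np.convolve(x_adv, kernel, mode="same")

x_clean = rng.normal(size=64)
# Generic sign perturbation standing in for an attack algorithm's output.
x_adv = x_clean + 0.3 * rng.choice([-1.0, 1.0], size=64)

# The defense classifies the reconstructed input instead of the raw one.
label = classify(reconstruct(x_adv))
```

The point of the sketch is only the call order: the suspect input passes through the generator before it ever reaches the classifier.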
Citations: 0
AFFNet: adversarial feature fusion network for super-resolution image reconstruction in remote sensing images
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033032
Qian Zhao, Qianxi Yin
As an important source of Earth-surface information, remote sensing images often suffer from rough, blurred details and poor perceptual quality, which hinders further analysis and application of geographic information. To address these problems, this paper introduces an adversarial feature fusion network with an attention-based mechanism for super-resolution reconstruction of remote sensing images. First, residual structures are designed in the generator to enhance deep feature extraction. Each residual structure combines depthwise over-parameterized convolution with a self-attention mechanism; the two work synergistically to extract deep feature information from remote sensing images. Second, a coordinate attention feature fusion module is introduced at the feature fusion stage to fuse shallow features with deep high-level features, strengthening the model's attention to different features and better fusing inconsistent semantic features. Finally, a pixel-attention upsampling module is designed for the upsampling stage; it adaptively focuses on the most information-rich regions of a pixel and restores image details more accurately. Extensive experiments on several remote sensing image datasets show that, compared with current advanced models, our method better restores image details and achieves good subjective visual quality, verifying the effectiveness and superiority of the proposed algorithm.
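The coordinate-attention fusion idea can be illustrated with a bare numpy sketch: pool a (C, H, W) feature map along each spatial direction, turn the pooled profiles into weights, and use them to re-weight the map before fusing two feature levels. This is an assumption-laden simplification (real coordinate attention also applies learned 1x1 convolutions, omitted here), not AFFNet's module.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x):
    """Minimal coordinate-attention sketch for a (C, H, W) feature map:
    directional average pooling along W and H yields per-row and per-column
    profiles, which re-weight the map (learned convs omitted)."""
    h_pool = x.mean(axis=2, keepdims=True)   # (C, H, 1): pooled along width
    w_pool = x.mean(axis=1, keepdims=True)   # (C, 1, W): pooled along height
    return x * sigmoid(h_pool) * sigmoid(w_pool)

def fuse(shallow, deep):
    """Fuse a shallow and a deep feature map after coordinate attention."""
    return coordinate_attention(shallow) + coordinate_attention(deep)

rng = np.random.default_rng(1)
shallow = rng.normal(size=(8, 16, 16))
deep = rng.normal(size=(8, 16, 16))
fused = fuse(shallow, deep)
```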
Citations: 0
Lightweight deep and cross residual skip connection separable CNN for plant leaf diseases classification
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033035
Naresh Vedhamuru, Ramanathan Malmathanraj, Ponnusamy Palanisamy
Crop diseases adversely affect the yield, productivity, and quality of agricultural produce, threatening the safety and security of the global food supply. Addressing and controlling plant diseases through timely disease management strategies that reduce their transmission is essential for minimizing crop loss and meeting the growing worldwide demand for food as the population steadily increases. Mitigation begins with preventive monitoring, enabling early detection and classification of plant diseases so that effective agricultural procedures can improve crop yield. Early detection and accurate diagnosis allow farmers to deploy disease management strategies; such interventions are critical for better management, contributing to higher crop output by curbing the spread of infection and limiting the damage caused by disease. We propose and implement a deep and cross residual skip connection separable convolutional neural network (DCRSCSCNN) for identifying and classifying leaf diseases of crops including apple, corn, cucumber, grape, potato, and guava. The significant features of DCRSCSCNN are the residual skip connection and the cross residual skip connection separable convolution block. Residual skip connections help fix the vanishing-gradient issue faced by the network architecture, and separable convolution decreases the number of parameters, yielding a smaller model. So far, there has been limited investigation of separable convolution within lightweight neural networks. Extensive evaluation on several training and test sets drawn from distinct datasets demonstrates that the proposed DCRSCSCNN outperforms other state-of-the-art approaches, achieving classification and identification accuracy rates of 99.89% for apple, 98.72% for corn, 100% for cucumber, 99.78% for grape, 100% for potato, 99.69% for guava1, and 99.08% for guava2 datasets.
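The claim that separable convolution shrinks the model follows from a simple parameter count: a standard k x k convolution costs k*k*C_in*C_out parameters, while a depthwise-plus-pointwise factorization costs k*k*C_in + C_in*C_out. The layer sizes below (3x3 kernel, 64 to 128 channels) are illustrative, not taken from the paper.

```python
# Parameter count for one convolutional layer, standard vs. separable.
k, c_in, c_out = 3, 64, 128          # illustrative sizes, not the paper's

standard = k * k * c_in * c_out      # ordinary convolution
separable = k * k * c_in + c_in * c_out  # depthwise + pointwise (1x1)

print(standard, separable, separable / standard)
```

For these sizes the separable layer needs roughly an eighth of the parameters, which is the mechanism behind the "reduced size" claim.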
Citations: 0
Stega4NeRF: cover selection steganography for neural radiance fields
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033031
Weina Dong, Jia Liu, Lifeng Chen, Wenquan Sun, Xiaozhong Pan
The implicit neural representation of visual data (such as images, videos, and 3D models) has become a hotspot in computer vision research. This work proposes a cover-selection steganography scheme for neural radiance fields (NeRFs). The message sender first trains a NeRF model, selecting an arbitrary viewpoint in 3D space as the viewpoint key Kv to generate a unique secret viewpoint image. Subsequently, a message extractor is trained by overfitting to establish a one-to-one mapping between the secret viewpoint image and the secret message. To address the problem of securely transmitting the message extractor in traditional steganography, the extractor is concealed within a hybrid model that performs standard classification tasks. The receiver holds a shared extractor key Ke, used to recover the message extractor from the hybrid model. The secret viewpoint image is then rendered by the NeRF using the viewpoint key Kv and fed into the message extractor to recover the secret message. Experimental results demonstrate that the trained message extractor achieves high-speed, high-capacity steganography and attains 100% message embedding. Additionally, the vast viewpoint key space of NeRF ensures the concealment of the scheme.
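The "overfit an extractor to one image" step can be demonstrated in miniature with a closed-form linear map: fit W so that the single secret-viewpoint image maps exactly to the message bits. This toy (a 64-dim vector standing in for the NeRF render, a 32-bit message, and a rank-one least-squares solution) only illustrates that a deliberately overfit mapping recovers the message perfectly; the paper's extractor is a neural network.

```python
import numpy as np

rng = np.random.default_rng(2)
secret_image = rng.normal(size=64)   # stand-in for the rendered secret view
message = rng.integers(0, 2, size=32).astype(float)  # secret bits

# Closed-form overfit on the single sample: W @ secret_image == message.
W = np.outer(message, secret_image) / (secret_image @ secret_image)

# Extraction: threshold the linear response back to bits.
recovered = (W @ secret_image > 0.5).astype(float)
```

Because W is fit to exactly one input, recovery is exact on that input and meaningless elsewhere, which is precisely the one-to-one behavior the scheme relies on.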
Citations: 0
Low-light image enhancement using negative feedback pulse coupled neural network
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033037
Ping Gao, Guidong Zhang, Lingling Chen, Xiaoyun Chen
Low-light image enhancement, fundamentally an ill-posed problem, seeks to deliver superior visual effects while preserving a natural appearance. Current methods often show limitations in contrast enhancement, noise reduction, and the mitigation of halo artifacts. A negative feedback pulse coupled neural network (NFPCNN) is proposed to provide a well-posed, uniform-distribution-based solution for contrast enhancement. The negative feedback dynamically adjusts the attenuation amplitude of each neuron's threshold based on its recent ignition state: neurons in areas of concentrated brightness are assigned smaller attenuation amplitudes to enhance local contrast, whereas neurons in sparse areas are assigned larger ones. NFPCNN thus compensates for the conventional pulse coupled neural network's neglect of the input image's brightness distribution. Consistent with the Weber–Fechner law, gamma correction is employed to adjust the output of NFPCNN. Although contrast enhancement improves detail expressiveness, it can also introduce artifacts or aggravate noise. To mitigate these issues, a bilateral filter is employed to suppress halo artifacts, and brightness is used as a coefficient to refine the Relativity-of-Gaussian noise suppression method. Experimental results show that the proposed method effectively suppresses noise while enhancing image contrast.
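The threshold-decay mechanism the abstract builds on can be sketched with a bare-bones PCNN loop: each pixel's neuron fires when the stimulus exceeds its threshold, and the threshold then jumps and decays exponentially, so brighter pixels fire earlier. The decay rate alpha below is fixed; adapting it per neuron via negative feedback is exactly what NFPCNN adds, and is not implemented here.

```python
import numpy as np

def pcnn_fire_times(stimulus, alpha=0.3, v=10.0, steps=30):
    """Record the first firing step of each pixel in a simplified PCNN.
    alpha: threshold decay rate (the knob NFPCNN would adapt per neuron).
    v: refractory jump added to the threshold after a neuron fires."""
    theta = np.ones_like(stimulus)              # initial thresholds
    first_fire = np.full(stimulus.shape, -1)    # -1 = not fired yet
    for t in range(steps):
        fired = stimulus >= theta
        first_fire = np.where((first_fire < 0) & fired, t, first_fire)
        theta = theta * np.exp(-alpha) + v * fired  # decay + jump
    return first_fire

img = np.linspace(0.1, 1.0, 16).reshape(4, 4)   # toy brightness map
times = pcnn_fire_times(img)
```

The resulting fire-time map is a monotone function of brightness, which is what makes the pulse sequence usable as a contrast-enhancement signal.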
Citations: 0
Face antispoofing method based on single-modal and lightweight network
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033030
Guoxiang Tong, Xinrong Yan
In the field of face antispoofing, researchers are increasingly focusing on multimodal approaches and feature fusion. While multimodal approaches are more effective than single-modal ones, they often carry a huge number of parameters, require significant computational resources, and are difficult to execute on mobile devices. To address the real-time problem, we propose a fast, lightweight framework based on ShuffleNet V2. Our approach takes patch-level images as input, enhances unit performance by introducing an attention module, and addresses dataset sample imbalance through the focal loss function, effectively tackling the model's real-time constraints. We evaluate the model on the CASIA-FASD, Replay-Attack, and MSU-MFSD datasets. The results demonstrate that our method outperforms current state-of-the-art methods in both intra-test and inter-test scenarios. Furthermore, our network has only 0.84 M parameters and 0.81 GFlops, making it suitable for deployment in mobile and real-time settings. Our work can serve as a valuable reference for researchers developing single-modal face antispoofing methods for mobile and real-time applications.
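The focal loss mentioned for handling sample imbalance has a standard closed form, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The sketch below is the generic formulation (hyperparameters gamma=2, alpha=0.25 are common defaults, not values from this paper); the key property is that confident, easy examples contribute almost nothing, so the loss concentrates on hard and minority-class samples.

```python
import numpy as np

def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal loss for the probability assigned to the true class.
    gamma down-weights easy examples; gamma=0, alpha=1 recovers plain
    cross-entropy."""
    p_true = np.clip(p_true, 1e-7, 1.0)
    return -alpha * (1.0 - p_true) ** gamma * np.log(p_true)

easy = focal_loss(np.array(0.95))   # confidently correct sample
hard = focal_loss(np.array(0.30))   # poorly classified sample
```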
Citations: 0
Adaptive sparse attention module based on reciprocal nearest neighbors
IF 1.1 | CAS Zone 4 (Computer Science) | Q4 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033038
Zhonggui Sun, Can Zhang, Mingzhu Zhang
The attention mechanism has become a crucial technique in deep feature representation for computer vision tasks. Using a similarity matrix, it enhances each feature point with global context from the network's feature map. However, indiscriminately using all of this information easily introduces irrelevant content and inevitably hampers performance. In response, sparsification, a common information-filtering strategy, has been applied in many related studies; regrettably, the filtering processes often lack reliability and adaptability. To address this issue, we first define an adaptive-reciprocal nearest neighbors (A-RNN) relationship. In identifying neighbors, it gains flexibility by learning adaptive thresholds, and a reciprocity mechanism ensures the neighbors' reliability. We then use A-RNN to rectify the similarity matrix in the conventional attention module. In the implementation, to treat non-local and local information separately, we introduce two blocks: a non-local sparse constraint block, which uses A-RNN to sparsify non-local information, and a local sparse constraint block, which uses adaptive thresholds to sparsify local information. The result is an adaptive sparse attention (ASA) module that inherits the flexibility and reliability of A-RNN. To validate the proposed ASA module, we use it to replace the attention module in NLNet and conduct experiments on semantic segmentation benchmarks including Cityscapes, ADE20K, and PASCAL VOC 2012. With the same backbone (ResNet101), our ASA module outperforms the conventional attention module and several of its state-of-the-art variants.
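The reciprocity mechanism can be shown on a toy similarity matrix: entry (i, j) survives only if j is among i's top-k most similar positions and i is among j's. Note the paper learns adaptive thresholds rather than using a fixed k; the fixed k below is a simplification for illustration.

```python
import numpy as np

def reciprocal_mask(sim, k=2):
    """Keep (i, j) only when i and j are mutually among each other's
    k nearest neighbors under the similarity matrix `sim`."""
    n = sim.shape[0]
    topk = np.argsort(-sim, axis=1)[:, :k]       # each row's k most similar
    nn = np.zeros_like(sim, dtype=bool)
    nn[np.arange(n)[:, None], topk] = True        # one-directional kNN
    return nn & nn.T                              # enforce reciprocity

rng = np.random.default_rng(3)
a = rng.normal(size=(6, 6))
sim = a @ a.T                                     # symmetric toy similarity
sparse_sim = np.where(reciprocal_mask(sim), sim, 0.0)  # rectified attention
```

Reciprocity makes the surviving links symmetric and strictly fewer than plain kNN, which is the "reliability" the abstract refers to.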
Citations: 0
Super-resolution reconstruction of images based on residual dual-path interactive fusion combined with attention
IF 1.1 Region 4 Computer Science Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-06-01 DOI: 10.1117/1.jei.33.3.033034
Wang Hao, Peng Taile, Zhou Ying
In recent years, deep learning has made significant progress in single-image super-resolution (SISR) reconstruction, greatly improving reconstruction quality. However, most SISR networks focus too heavily on increasing network depth during feature extraction and neglect both the connections between different levels of features and the full use of low-frequency feature information. To address this problem, this work proposes a network based on residual dual-path interactive fusion combined with attention (RDIFCA). Using the dual interactive fusion strategy, the network achieves effective fusion and multiplexing of high- and low-frequency information while increasing network depth, which significantly enhances its expressive ability. Experimental results show that the proposed RDIFCA network outperforms comparable methods in objective evaluation metrics and visual quality on the Set5, Set14, BSD100, Urban100, and Manga109 test sets.
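The high/low-frequency separation that RDIFCA fuses and multiplexes can be illustrated with a toy NumPy sketch, assuming a simple box blur as the low-pass filter. This is an illustrative decomposition only, not the paper's network; `split_frequencies` and `fuse` are hypothetical names.

```python
import numpy as np

def box_blur(img, radius=1):
    """Cheap low-pass filter: mean over a (2r+1)^2 window,
    edge-padded so the output matches the input shape."""
    pad = np.pad(img, radius, mode="edge")
    k = 2 * radius + 1
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + h, dx:dx + w]
    return out / (k * k)

def split_frequencies(img, radius=1):
    """Decompose an image into a low-frequency base and a
    high-frequency residual; the two sum back to the input exactly."""
    low = box_blur(img, radius)
    return low, img - low

def fuse(low, high, alpha=1.0):
    """Recombine the two paths; alpha > 1 sharpens, alpha < 1 smooths."""
    return low + alpha * high
```

Because `high` is defined as the residual `img - low`, the decomposition is lossless, which is what makes processing the two paths separately and then fusing them a viable design.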
{"title":"Super-resolution reconstruction of images based on residual dual-path interactive fusion combined with attention","authors":"Wang Hao, Peng Taile, Zhou Ying","doi":"10.1117/1.jei.33.3.033034","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.033034","url":null,"abstract":"In recent years, deep learning has made significant progress in the field of single-image super-resolution (SISR) reconstruction, which has greatly improved reconstruction quality. However, most of the SISR networks focus too much on increasing the depth of the network in the process of feature extraction and neglect the connections between different levels of features as well as the full use of low-frequency feature information. To address this problem, this work proposes a network based on residual dual-path interactive fusion combined with attention (RDIFCA). Using the dual interactive fusion strategy, the network achieves the effective fusion and multiplexing of high- and low-frequency information while increasing the depth of the network, which significantly enhances the expressive ability of the network. The experimental results show that the proposed RDIFCA network exhibits certain superiority in terms of objective evaluation indexes and visual effects on the Set5, Set14, BSD100, Urban100, and Manga109 test sets.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"151 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Robust classification with noisy labels using Venn–Abers predictors
IF 1.1 Region 4 Computer Science Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-06-01 DOI: 10.1117/1.jei.33.3.031210
Ichraq Lemghari, Sylvie Le Hégarat-Mascle, Emanuel Aldea, Jennifer Vandoni
The advent of deep learning methods has led to impressive advances in computer vision tasks over the past decades, largely due to their ability to extract non-linear features well adapted to the task at hand. For supervised approaches, data labeling is essential to achieve a high level of performance; however, this task can be so tedious, or even troublesome in difficult contexts (e.g., specific defect detection, unconventional data annotations, etc.), that experts sometimes provide the wrong ground-truth label. Focusing on classification problems, this paper addresses the issue of handling noisy labels in datasets. Specifically, we first detect the noisy samples of a dataset using set-valued labels and then improve their classification using Venn–Abers predictors. The obtained results reach accuracies above 0.99 and 0.90 on noisified versions of two widely used image classification datasets, digit MNIST and CIFAR-10 respectively, with a 40% two-class pair-flip noise ratio, and 0.87 on CIFAR-10 with a uniform 40% noise ratio over 10 classes.
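The "40% two-class pair-flip noise" setting used in the evaluation can be reproduced in miniature. The sketch below is a hedged illustration of injecting such noise with NumPy, not the authors' experimental code; `pair_flip_noise` is a hypothetical name.

```python
import numpy as np

def pair_flip_noise(labels, pairs, ratio, seed=0):
    """Corrupt labels with pair-flip noise: each sample's label is
    replaced by its paired class with probability `ratio`.
    `pairs` maps a class to its flip target, e.g. {0: 1, 1: 0}."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < ratio
    for i in np.where(flip)[0]:
        noisy[i] = pairs.get(int(labels[i]), int(labels[i]))
    return noisy
```

Unlike uniform (symmetric) noise, which flips a label to any other class, pair-flip noise is structured: each class is confused only with one specific partner, which is generally harder for noise-robust methods to untangle.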
{"title":"Robust classification with noisy labels using Venn–Abers predictors","authors":"Ichraq Lemghari, Sylvie Le Hégarat-Mascle, Emanuel Aldea, Jennifer Vandoni","doi":"10.1117/1.jei.33.3.031210","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.031210","url":null,"abstract":"The advent of deep learning methods has led to impressive advances in computer vision tasks over the past decades, largely due to their ability to extract non-linear features that are well adapted to the task at hand. For supervised approaches, data labeling is essential to achieve a high level of performance; however, this task can be so fastidious or even troublesome in difficult contexts (e.g., specific defect detection, unconventional data annotations, etc.) that experts can sometimes erroneously provide the wrong ground truth label. Considering classification problems, this paper addresses the issue of handling noisy labels in datasets. Specifically, we first detect the noisy samples of a dataset using set-valued labels and then improve their classification using Venn–Abers predictors. The obtained results reach more than 0.99 and 0.90 accuracy for noisified versions of two widely used image classification datasets, digit MNIST and CIFAR-10 respectively with a 40% two-class pair-flip noise ratio and 0.87 accuracy for CIFAR-10 with 10-class uniform 40% noise ratio.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"12 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Monitoring of industrial crystallization processes through image sequence segmentation and characterization
IF 1.1 Region 4 Computer Science Q4 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-06-01 DOI: 10.1117/1.jei.33.3.031211
Saïd Rahmani, Roger de Souza Lima, Eric Serris, Ana Cameirão, Johan Debayle
To enhance control and monitoring of industrial crystallization processes, we propose an innovative nondestructive imaging method utilizing in situ 2D vision sensors. This approach enables the acquisition of 2D videos depicting crystal aggregates throughout the batch crystallization process. Our approach is built upon experimental observations, specifically regarding the process dynamics and sensor fouling. It involves dynamic segmentation of observed aggregates, from which quantitative analyses are derived. Notably, our method allows for tracking the evolution of the particle size distribution of crystal aggregates over time and the determination of the growth kinetics of crystals that agglomerate at the sensor air gap. This enables the detection of key stages in the crystallization process and the geometric characterization of crystal aggregate production.
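Deriving a particle size distribution from a segmented frame can be sketched with connected-component labeling. The snippet below is a minimal stand-in using `scipy.ndimage`, not the authors' pipeline; `particle_areas` and `equivalent_diameters` are hypothetical names.

```python
import numpy as np
from scipy import ndimage

def particle_areas(mask):
    """Label 4-connected components in a binary segmentation mask and
    return the pixel area of each detected aggregate."""
    labeled, n = ndimage.label(mask)
    if n == 0:
        return np.zeros(0, dtype=int)
    # pixel count per label; index 0 is background, so drop it
    return np.bincount(labeled.ravel())[1:]

def equivalent_diameters(areas):
    """Circle-equivalent diameter for each area, a common
    descriptor when reporting a particle size distribution."""
    return np.sqrt(4.0 * np.asarray(areas, dtype=float) / np.pi)
```

Running this per frame of the in situ video yields a time series of size distributions, from which trends such as aggregate growth can be read off; calibrating pixel areas to physical units would require the sensor's optical scale.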
{"title":"Monitoring of industrial crystallization processes through image sequence segmentation and characterization","authors":"Saïd Rahmani, Roger de Souza Lima, Eric Serris, Ana Cameirão, Johan Debayle","doi":"10.1117/1.jei.33.3.031211","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.031211","url":null,"abstract":"To enhance control and monitoring of industrial crystallization processes, we propose an innovative nondestructive imaging method utilizing in situ 2D vision sensors. This approach enables the acquisition of 2D videos depicting crystal aggregates throughout the batch crystallization process. Our approach is built upon experimental observations, specifically regarding the process dynamics and sensor fouling. It involves dynamic segmentation of observed aggregates, from which quantitative analyses are derived. Notably, our method allows for tracking the evolution of the particle size distribution of crystal aggregates over time and the determination of the growth kinetics of crystals that agglomerate at the sensor air gap. This enables the detection of key stages in the crystallization process and the geometric characterization of crystal aggregate production.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"197 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Journal of Electronic Imaging