
IEEE Transactions on Artificial Intelligence: Latest Articles

Broad Siamese Network for Facial Beauty Prediction
Pub Date : 2024-07-24 DOI: 10.1109/TAI.2024.3429293
Yikai Li;Tong Zhang;C. L. Philip Chen
Facial beauty prediction (FBP) aims to automatically predict beauty scores of facial images according to human perception. Facial images usually contain a great deal of information irrelevant to facial beauty, such as pose, emotion, and illumination, which interferes with the prediction of facial beauty. To overcome this interference, we develop a broad Siamese network (BSN) that focuses more on the task of beauty prediction. Specifically, BSN consists mainly of three components: a multitask Siamese network (MTSN), a multilayer attention (MLA) module, and a broad representation learning (BRL) module. First, MTSN is proposed with different tasks about facial beauty to fully mine knowledge about attractiveness and guide the network to neglect interference information. In the subnetwork of MTSN, the MLA module is proposed to focus more on salient features about facial beauty and reduce the impact of interference information. Then, the BRL module, based on the broad learning system (BLS), is developed to learn discriminative features with the guidance of beauty scores, which further frees facial features from the impact of interference information. Comparisons with state-of-the-art methods demonstrate the effectiveness of BSN.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5786-5800
Citations: 0
CycleGAN*: Collaborative AI Learning With Improved Adversarial Neural Networks for Multimodalities Data
Pub Date : 2024-07-23 DOI: 10.1109/TAI.2024.3432856
Yibo He;Kah Phooi Seng;Li Minn Ang
With the widespread adoption of generative adversarial networks (GANs) for sample generation, this article aims to enhance adversarial neural networks to facilitate collaborative artificial intelligence (AI) learning which has been specifically tailored to handle datasets containing multimodalities. Currently, a significant portion of the literature is dedicated to sample generation using GANs, with the objective of enhancing the detection performance of machine learning (ML) classifiers through the incorporation of these generated data into the original training set via adversarial training. The quality of the generated adversarial samples is contingent upon the sufficiency of training data samples. However, in the multimodal domain, the scarcity of multimodal data poses a challenge due to resource constraints. In this article, we address this challenge by proposing a new multimodal dataset generation approach based on the classical audio–visual speech recognition (AVSR) task, utilizing CycleGAN, DiscoGAN, and StyleGAN2 for exploration and performance comparison. AVSR experiments are conducted using the LRS2 and LRS3 corpora. Our experiments reveal that CycleGAN, DiscoGAN, and StyleGAN2 do not effectively address the low-data state problem in AVSR classification. Consequently, we introduce an enhanced model, CycleGAN*, based on the original CycleGAN, which efficiently learns the original dataset features and generates high-quality multimodal data. Experimental results demonstrate that the multimodal datasets generated by our proposed CycleGAN* exhibit significant improvement in word error rate (WER), indicating reduced errors. Notably, the images produced by CycleGAN* exhibit a marked enhancement in overall visual clarity, indicative of its superior generative capabilities. Furthermore, in contrast to traditional approaches, we underscore the significance of collaborative learning. We implement co-training with diverse multimodal data to facilitate information sharing and complementary learning across modalities. This collaborative approach enhances the model’s capability to integrate heterogeneous information, thereby boosting its performance in multimodal environments.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5616-5629
Citations: 0
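The CycleGAN* abstract above reports its gains in word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and whitespace tokenization are assumptions made for illustration; the paper's own evaluation pipeline is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower value means fewer recognition errors, which is why the abstract reads an improvement in WER as reduced errors. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.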
Cooperative Advantage Actor–Critic Reinforcement Learning for Multiagent Pursuit-Evasion Games on Communication Graphs
Pub Date : 2024-07-23 DOI: 10.1109/TAI.2024.3432511
Yizhen Meng;Chun Liu;Qiang Wang;Longyu Tan
This article investigates the distributed optimal strategy problem in multiagent pursuit-evasion (MPE) games, striving for Nash equilibrium through the optimization of individual benefit matrices based on observations. To this end, a novel collaborative control scheme for MPE games using communication graphs is proposed. This scheme employs cooperative advantage actor–critic (A2C) reinforcement learning to facilitate collaborative capture by pursuers in a distributed manner while maintaining bounded system signals. The strategy orchestrates the actions of pursuers through adaptive neural network learning, ensuring proximity-based collaboration for effective captures. Meanwhile, evaders aim to evade collectively by converging toward each other. Through extensive simulations involving five pursuers and two evaders, the efficacy of the proposed approach is demonstrated, and pursuers seamlessly organize into pursuit units and capture evaders, validating the collaborative capture objective. This article represents a promising step toward effective and cooperative control strategies in MPE game scenarios.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 12, pp. 6509-6523
Citations: 0
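The pursuit-evasion abstract above builds on advantage actor-critic (A2C) learning. The quantity at the core of A2C is the advantage, which in its common one-step form is the temporal-difference error: reward plus the discounted value of the next state, minus the value of the current state. The sketch below shows only that generic estimate; the function name, argument layout, and discount factor are illustrative assumptions, and the cooperative, graph-based scheme of the paper is not reproduced here.

```python
def one_step_advantages(rewards, values, next_values, dones, gamma=0.99):
    """One-step advantage estimates A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    All arguments are equal-length sequences of floats (dones: bools).
    The bootstrap term is dropped on terminal transitions.
    """
    advantages = []
    for reward, value, next_value, done in zip(rewards, values, next_values, dones):
        bootstrap = 0.0 if done else gamma * next_value
        advantages.append(reward + bootstrap - value)
    return advantages


if __name__ == "__main__":
    # Two transitions; the second one ends the episode.
    print(one_step_advantages(
        rewards=[1.0, 0.0],
        values=[0.5, 0.2],
        next_values=[0.2, 0.9],
        dones=[False, True],
        gamma=0.5,
    ))
```

In a typical A2C update the actor is pushed toward actions with positive advantage, while the critic is trained to shrink the same temporal-difference error.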
Toward Correlated Sequential Rules
Pub Date : 2024-07-22 DOI: 10.1109/TAI.2024.3429306
Lili Chen;Wensheng Gan;Chien-Ming Chen
The goal of high-utility sequential pattern mining (HUSPM) is to efficiently discover profitable or useful sequential patterns in a large number of sequences. However, simply being aware of utility-eligible patterns is insufficient for making predictions. To compensate for this deficiency, high-utility sequential rule mining (HUSRM) is designed to explore the confidence or probability of predicting the occurrence of consequence sequential patterns based on the appearance of premise sequential patterns. It has numerous applications, such as product recommendation and weather prediction. However, the existing algorithm, known as HUSRM, is limited to extracting all eligible rules while neglecting the correlation between the generated sequential rules. To address this issue, we propose a novel algorithm called correlated high-utility sequential rule miner (CoUSR) to integrate the concept of correlation into HUSRM. The proposed algorithm requires not only that each rule be correlated but also that the patterns in the antecedent and consequent of the high-utility sequential rule be correlated. The algorithm adopts a utility-list structure to avoid multiple database scans. Additionally, several pruning strategies are used to improve the algorithm's efficiency and performance. Experiments on several real-world datasets demonstrate that CoUSR is effective and efficient in terms of running time and memory consumption. The code is available on GitHub: https://github.com/DSI-Lab1/CoUSR.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 10, pp. 5340-5351
Citations: 0
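The sequential-rule abstract above refers to the confidence of a rule, that is, how often the consequent follows once the antecedent has appeared. The sketch below computes that confidence under one commonly used definition, which is an assumption here: a rule X -> Y holds in a sequence when every item of X occurs before every item of Y, and confidence is the number of such sequences divided by the number of sequences containing X. It covers neither utility values nor the correlation measure of CoUSR, which the abstract does not specify; all function names are hypothetical.

```python
def contains_itemset(sequence, itemset):
    """True if every item of `itemset` appears somewhere in the sequence."""
    return itemset <= set().union(*sequence)


def rule_occurs(sequence, antecedent, consequent):
    """True if all antecedent items occur strictly before all consequent items.

    `sequence` is a list of itemsets (Python sets), ordered in time.
    """
    for split in range(1, len(sequence)):
        before = set().union(*sequence[:split])
        after = set().union(*sequence[split:])
        if antecedent <= before and consequent <= after:
            return True
    return False


def rule_confidence(database, antecedent, consequent):
    """Confidence of antecedent -> consequent over a list of sequences."""
    antecedent_support = sum(contains_itemset(s, antecedent) for s in database)
    if antecedent_support == 0:
        return 0.0
    rule_support = sum(rule_occurs(s, antecedent, consequent) for s in database)
    return rule_support / antecedent_support


if __name__ == "__main__":
    database = [
        [{"a"}, {"b"}, {"c"}],
        [{"a", "b"}, {"c"}],
        [{"c"}, {"a"}],
        [{"b"}, {"a"}, {"c"}],
    ]
    print(rule_confidence(database, {"a"}, {"c"}))
```

In this toy database the item `a` appears in all four sequences but precedes `c` in only three of them, so the rule `{a} -> {c}` is not fully reliable even though both items are frequent.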
Two-Stage Representation Refinement Based on Convex Combination for 3-D Human Poses Estimation
Pub Date : 2024-07-22 DOI: 10.1109/TAI.2024.3432028
Luefeng Chen;Wei Cao;Biao Zheng;Min Wu;Witold Pedrycz;Kaoru Hirota
Human pose estimation is an important and challenging problem: on the one hand, a 3-D pose is difficult to tell apart from different 2-D poses when the view is limited; on the other hand, the lifting ambiguity is hard to reduce because depth information is lacking. Therefore, a two-stage representation refinement based on convex combination is proposed for 3-D human pose estimation, in which the two-stage method includes a dense-spatial-temporal convolutional network and a local-to-refine network. The former is applied to determine the features between video frames; the latter is used to capture pose details at different scales. The method aims to address the difficulty of estimating 3-D human pose from 2-D image sequences. In this way, it can better use the relations between the frames of a pose video sequence to produce more accurate results. Finally, we combine the above network with a block called convex combination to help refine the 3-D pose location. We test the proposed approach on both the Human3.6m and MPII datasets. The results confirm that our method achieves better performance than improved CNN supervision, a simple yet effective baseline, and coarse-to-fine volumetric prediction. In addition, a robustness test is carried out for the proposed method with interrupted input. The results verify that our method is more robust.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 12, pp. 6500-6508
Citations: 0
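The pose-estimation abstract above names a block called convex combination but does not describe its internals. The sketch below only illustrates the mathematical idea of a convex combination applied to candidate 3-D poses: nonnegative weights that sum to one, so each refined joint stays inside the convex hull of the candidates. Deriving the weights from a softmax over scores, and the names `softmax` and `convex_combine`, are assumptions for illustration and not the paper's design.

```python
import math


def softmax(scores):
    """Map arbitrary scores to nonnegative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def convex_combine(candidates, scores):
    """Blend candidate poses with convex weights.

    candidates: list of poses, each a list of (x, y, z) joint positions.
    scores: one score per candidate (hypothetical confidence values).
    """
    weights = softmax(scores)
    num_joints = len(candidates[0])
    combined = []
    for j in range(num_joints):
        joint = tuple(
            sum(w * pose[j][axis] for w, pose in zip(weights, candidates))
            for axis in range(3)
        )
        combined.append(joint)
    return combined


if __name__ == "__main__":
    pose_a = [(0.0, 0.0, 0.0)]
    pose_b = [(2.0, 4.0, 6.0)]
    print(convex_combine([pose_a, pose_b], scores=[0.0, 0.0]))
```

With equal scores the result is the midpoint of the two candidates; as one score grows, the output moves toward that candidate but never leaves the segment between them.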
Deep Learning Security Breach by Evolutionary Universal Perturbation Attack (EUPA)
Pub Date : 2024-07-19 DOI: 10.1109/TAI.2024.3429473
Neeraj Gupta;Mahdi Khosravy;Antoine Pasquali;Olaf Witkowski
The universal perturbation attack (UPA) has proved to be an effective way to sabotage deep convolutional neural network classifiers, fooling deep learning models in sensitive applications such as autonomous vehicles, clinical diagnosis, and face recognition. A prospective application of UPA is the adversarial training of deep convolutional networks against such attacks. Although evolutionary algorithms have already shown tremendous ability in solving nonconvex complex problems, the literature has explored evolutionary techniques and strategies for UPA only to a limited extent. Evolutionary algorithms therefore need to be explored for minimizing the magnitude and number of perturbation pixels while maximizing the number of misclassified data samples. This work focuses on utilizing an integer-coded genetic algorithm within an evolutionary framework to evolve the UPA. The evolutionary UPA has been structured, analyzed, and compared for two evolutionary optimization structures: 1) constrained single-objective evolutionary UPA; and 2) Pareto double-objective evolutionary UPA. The effectiveness of the methodology is analyzed on the GoogleNet convolutional neural network with the ImageNet dataset. The results show that under the same experimental conditions, the constrained single-objective technique outperforms the Pareto double-objective one and manages a successful breach of a deep network, wherein the average detection score falls to $0.446429$. It is observed that, alongside minimizing the detection rate score, imposing noise invisibility as a constraint is much more effective than treating noise power minimization as a conflicting objective.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5655-5665
Citations: 0
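The EUPA abstract above contrasts two formulations: a constrained single-objective search and a Pareto double-objective search. The sketch below illustrates the difference in generic terms, assuming both the detection score and the noise power are to be minimized. The penalty-based constraint handling, the threshold value, and all function names are hypothetical choices for illustration; the abstract does not state how the paper handles its constraint.

```python
def constrained_fitness(detection_score, max_abs_perturbation, limit, penalty=1.0):
    """Single objective: detection score plus a penalty for visible noise.

    The penalty term is zero while the perturbation stays within `limit`,
    so feasible candidates are ranked by detection score alone.
    """
    violation = max(0.0, max_abs_perturbation - limit)
    return detection_score + penalty * violation


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Each tuple: (detection score, noise power), both to be minimized.
    candidates = [(0.4, 5.0), (0.5, 3.0), (0.6, 6.0)]
    print(pareto_front(candidates))
    print(constrained_fitness(0.45, max_abs_perturbation=8, limit=10))
```

The single-objective form returns one ranking, whereas the Pareto form returns a set of trade-offs between the two objectives, which is the structural difference the abstract's comparison rests on.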
Hierarchical Spatial-Temporal Masked Contrast for Skeleton Action Recognition
Pub Date : 2024-07-17 DOI: 10.1109/TAI.2024.3430260
Wenming Cao;Aoyu Zhang;Zhihai He;Yicha Zhang;Xinpeng Yin
In the field of 3-D action recognition, self-supervised learning has shown promising results but remains a challenging task. Previous approaches to motion modeling often relied on selecting features solely from the temporal or spatial domain, which limited the extraction of higher-level semantic information. Additionally, traditional one-to-one approaches in multilevel comparative learning overlooked the relationships between different levels, hindering the learning representation of the model. To address these issues, we propose the hierarchical spatial-temporal masked network (HSTM) for learning 3-D action representations. HSTM introduces a novel masking method that operates simultaneously in both the temporal and spatial dimensions. This approach leverages semantic relevance to identify meaningful regions in time and space, guiding the masking process based on semantic richness. This guidance is crucial for learning useful feature representations effectively. Furthermore, to enhance the learning of potential features, we introduce cross-level distillation (CLD) to extend the comparative learning approach. By training the model with two types of losses simultaneously, each level of the multilevel comparative learning process can be guided by levels rich in semantic information. This allows for more effective supervision of comparative learning, leading to improved performance. Extensive experiments conducted on the NTU-60, NTU-120, and PKU-MMD datasets demonstrate the effectiveness of our proposed framework. The learned action representations exhibit strong transferability and achieve state-of-the-art results.
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5801-5814
Citations: 0
Label-Efficient Time Series Representation Learning: A Review
Pub Date : 2024-07-17 DOI: 10.1109/TAI.2024.3430236
Emadeldeen Eldele;Mohamed Ragab;Zhenghua Chen;Min Wu;Chee-Keong Kwoh;Xiaoli Li
Label-efficient time series representation learning, which aims to learn effective representations from limited labeled data, is crucial for deploying deep learning models in real-world applications. To address the scarcity of labeled time series data, various strategies have been developed, e.g., transfer learning, self-supervised learning, and semisupervised learning. In this survey, we introduce a novel taxonomy that categorizes existing approaches as in-domain or cross-domain according to whether they rely on external data sources. Furthermore, we review recent advances in each strategy, summarize the limitations of current methodologies, and suggest future research directions that promise further improvements in the field.
IEEE Transactions on Artificial Intelligence, vol. 5, no. 12, pp. 6027-6042. https://doi.org/10.1109/TAI.2024.3430236
Citations: 0
A Study of Enhancing Federated Learning on Non-IID Data With Server Learning
Pub Date : 2024-07-17 DOI: 10.1109/TAI.2024.3430250
Van Sy Mai;Richard J. La;Tao Zhang
Federated learning (FL) has emerged as a means of distributed learning using local data stored at clients with a coordinating server. Recent studies showed that FL can suffer from poor performance and slower convergence when training data at the clients are not independent and identically distributed (IID). Here, we consider auxiliary server learning (SL) as a complementary approach to improving the performance of FL on non-IID data. Our analysis and experiments show that this approach can achieve significant improvements in both model accuracy and convergence time even when the dataset utilized by the server is small and its distribution differs from that of the clients’ aggregate data. Moreover, experimental results suggest that auxiliary SL delivers benefits when employed together with other techniques proposed to mitigate the performance degradation of FL on non-IID data.
IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5589-5604. https://doi.org/10.1109/TAI.2024.3430250
Citations: 0
Exploring Machine Learning for Semiconductor Process Optimization: A Systematic Review
Pub Date : 2024-07-17 DOI: 10.1109/TAI.2024.3429479
Ying-Lin Chen;Sara Sacchi;Bappaditya Dey;Victor Blanco;Sandip Halder;Philippe Leray;Stefan De Gendt
As machine learning (ML) continues to find applications, extensive research is underway across various domains. This study examines the ML methodologies currently being investigated to optimize semiconductor manufacturing processes. Our research involved searching the SPIE Digital Library, IEEE Xplore, and arXiv databases, identifying 58 publications in the field of ML-based semiconductor process optimization. These investigations employ ML techniques such as feature extraction, feature selection, and neural network architectures, which are analyzed using different algorithms. The resulting models find applications in advanced process control, virtual metrology, and quality control, all critical aspects of semiconductor manufacturing for enhancing throughput and reducing production costs. We categorize the articles based on the methods and applications employed, summarizing the primary findings. Furthermore, we discuss the general conclusions of several studies. Overall, the reviewed literature suggests that ML-based semiconductor manufacturing is rapidly gaining popularity and advancing at a swift pace.
IEEE Transactions on Artificial Intelligence, vol. 5, no. 12, pp. 5969-5989. https://doi.org/10.1109/TAI.2024.3429479
Citations: 0