IEEE Transactions on Artificial Intelligence: Latest Articles

Hedge-Embedded Linguistic Fuzzy Neural Networks for Systems Identification and Control
Pub Date : 2024-04-30 DOI: 10.1109/TAI.2024.3395416
Hamed Rafiei;Mohammad-R. Akbarzadeh-T.
In the realm of natural language processing, hedge-embedded structures have contributed considerably by appreciating linguistic variables and distinguishing overlapped classes. This aspect of natural languages considerably affects the building of linguistically interpretable architectures for fuzzy neural networks (FNNs). Here, we propose extending the idea of hedge-embedded linguistic fuzzy neural networks (LiFNNs) to the systems identification and control paradigm. This perspective leads us to the universal approximation property for this mathematical construct using the Stone–Weierstrass theorem and the proof of stability for the resulting nonlinear system identification process using the Lyapunov function. Furthermore, the power activation functions in the membership degrees of the proposed network enable linguistic hedge interpretation and more precise learning. Finally, the proposed LiFNN, optimized using a backpropagation learning algorithm, is evaluated on several problems in function approximation (periodic functions and quadratic Hermite function), system identification (a nonlinear system), and direct adaptive control fields. Results show that memberships are more distinguishable in the proposed LiFNN, leading to approximately 50% less error on average and higher granulation and interpretability.
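The link between a power applied to a membership degree and a linguistic hedge can be illustrated with a minimal sketch. It follows the classical fuzzy-set convention in which squaring a membership degree reads as "very" and taking its square root reads as "somewhat". The Gaussian membership shape and the names `gaussian_membership`, `hedged_membership`, `center`, `width`, and `power` are assumptions made for illustration; the abstract does not specify the network's exact form.

```python
import math


def gaussian_membership(x: float, center: float, width: float) -> float:
    """Degree (0..1) to which x belongs to a fuzzy set centred at `center`.

    The Gaussian shape is an assumption for this sketch.
    """
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def hedged_membership(x: float, center: float, width: float, power: float) -> float:
    """Apply a linguistic hedge by raising the membership degree to `power`.

    power > 1 concentrates the set ("very");
    0 < power < 1 dilates it ("somewhat");
    power == 1 leaves the set unchanged.
    """
    return gaussian_membership(x, center, width) ** power


if __name__ == "__main__":
    # Compare the three readings at a point one width away from the centre.
    for label, power in [("somewhat", 0.5), ("plain", 1.0), ("very", 2.0)]:
        print(label, hedged_membership(1.0, 0.0, 1.0, power))
```

A larger power narrows the set around its centre, which is one way overlapping memberships can become easier to tell apart.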
Citations: 0
Improving Code Summarization With Tree Transformer Enhanced by Position-Related Syntax Complement
Pub Date : 2024-04-30 DOI: 10.1109/TAI.2024.3395231
Jie Song;Zexin Zhang;Zirui Tang;Shi Feng;Yu Gu
Code summarization aims to generate natural language (NL) summaries automatically given the source code snippet, which aids developers in understanding source code faster and improves software maintenance. Recent approaches using NL techniques in code summarization fall short of adequately capturing the syntactic characteristics of programming languages (PLs), particularly the position-related syntax, from which the semantics of the source code can be extracted. In this article, we present Syntax transforMer (SyMer) based on the transformer architecture where we enhance it with position-related syntax complement (PSC) to better capture syntactic characteristics. PSC takes advantage of unambiguous relations among code tokens in abstract syntax tree (AST), as well as the gathered attention on crucial code tokens indicated by its syntactic structure. The experimental results demonstrate that SyMer outperforms state-of-the-art models by at least 2.4% bilingual evaluation understudy (BLEU), 1.0% metric for evaluation of translation with explicit ORdering (METEOR) on Java benchmark, and 4.8% (BLEU), 5.1% (METEOR), and 3.2% recall-oriented understudy for gisting evaluation - longest common subsequence (ROUGE-L) on Python benchmark.
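To make "position-related syntax" concrete, the sketch below walks an abstract syntax tree and records where each node sits: its depth and its index among its siblings. It uses Python's standard `ast` module. It is not SyMer's PSC mechanism, whose details the abstract does not give; the function name `node_positions` and the choice of (depth, sibling index) as the position signal are assumptions for illustration.

```python
import ast


def node_positions(source: str):
    """Return (node_type, depth, sibling_index) for every AST node, in pre-order."""
    tree = ast.parse(source)
    positions = []

    def visit(node, depth, index):
        positions.append((type(node).__name__, depth, index))
        # iter_child_nodes yields the direct children in field order.
        for i, child in enumerate(ast.iter_child_nodes(node)):
            visit(child, depth + 1, i)

    visit(tree, 0, 0)
    return positions


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    for name, depth, index in node_positions(snippet):
        print("  " * depth + f"{name} (sibling {index})")
```

Positions of this kind are unambiguous because each node has exactly one parent, which is the property of the AST that the abstract highlights.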
Citations: 0
Redefining Real-Time Road Quality Analysis With Vision Transformers on Edge Devices
Pub Date : 2024-04-29 DOI: 10.1109/TAI.2024.3394797
Tasnim Ahmed;Naveed Ejaz;Salimur Choudhury
Road infrastructure is essential for transportation safety and efficiency. However, the current methods for assessing road conditions, crucial for effective planning and maintenance, suffer from high costs, time-intensive procedures, infrequent data collection, and limited real-time capabilities. This article presents an efficient lightweight system to analyze road quality from video feeds in real time. The backbone of the system is EdgeFusionViT, a novel vision transformer (ViT)-based architecture that uses an attention-based late fusion mechanism. The proposed architecture outperforms lightweight convolutional neural network (CNN)-based and ViT-based models. Its practicality is demonstrated by its deployment on an edge device, the Nvidia Jetson Orin Nano, enabling real-time road analysis at 12 frames per second. EdgeFusionViT outperforms existing benchmarks, achieving an impressive accuracy of 89.76% on the road surface condition dataset (RSCD). Notably, the model maintains a commendable accuracy of 76.89% even when trained with only 2% of the dataset, demonstrating its robustness and efficiency. These findings highlight the system's potential in road infrastructure management. It aids in creating safer, more efficient transport systems through timely, accurate road condition assessments. The study sets a new benchmark and opens up possibilities for advanced machine learning in infrastructure management.
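Attention-based late fusion can be sketched in general terms: each branch produces its own feature vector, and the vectors are combined afterwards with weights obtained from a softmax over attention scores. The code below is a generic illustration, not the EdgeFusionViT architecture, which the abstract does not detail. The dot-product scoring and the names `late_fusion`, `branch_features`, and `query` are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(branch_features, query):
    """Fuse per-branch feature vectors using dot-product attention weights.

    branch_features: list of equal-length feature vectors, one per branch.
    query: vector used to score each branch (hypothetical; it would be learned in practice).
    Returns the fused vector and the attention weights.
    """
    scores = [sum(f * q for f, q in zip(features, query)) for features in branch_features]
    weights = softmax(scores)
    fused = [
        sum(w * features[i] for w, features in zip(weights, branch_features))
        for i in range(len(query))
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = late_fusion([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
    print(weights, fused)
```

Fusing late keeps the branches independent until the final step, so each branch can stay small, which matters on an edge device.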
Citations: 0
Quadratic Neuron-Empowered Heterogeneous Autoencoder for Unsupervised Anomaly Detection
Pub Date : 2024-04-29 DOI: 10.1109/TAI.2024.3394795
Jing-Xiao Liao;Bo-Jian Hou;Hang-Cheng Dong;Hao Zhang;Xiaoge Zhang;Jinwei Sun;Shiping Zhang;Feng-Lei Fan
Inspired by the complexity and diversity of biological neurons, a quadratic neuron is proposed to replace the inner product in the current neuron with a simplified quadratic function. Employing such a novel type of neurons offers a new perspective on developing deep learning. When analyzing quadratic neurons, we find that there exists a function such that a heterogeneous network can approximate it well with a polynomial number of neurons but a purely conventional or quadratic network needs an exponential number of neurons to achieve the same level of error. Encouraged by this inspiring theoretical result on heterogeneous networks, we directly integrate conventional and quadratic neurons in an autoencoder to make a new type of heterogeneous autoencoders. To our best knowledge, it is the first heterogeneous autoencoder that is made of different types of neurons. Next, we apply the proposed heterogeneous autoencoder to unsupervised anomaly detection (AD) for tabular data and bearing fault signals. The AD faces difficulties such as data unknownness, anomaly feature heterogeneity, and feature unnoticeability, which is suitable for the proposed heterogeneous autoencoder. Its high feature representation ability can characterize a variety of anomaly data (heterogeneity), discriminate the anomaly from the normal (unnoticeability), and accurately learn the distribution of normal samples (unknownness). Experiments show that heterogeneous autoencoders perform competitively compared with other state-of-the-art models.
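The difference between a conventional neuron and a quadratic one can be shown with a short sketch. The quadratic form used here, a product of two inner products plus a weighted sum of squared inputs, is one simplified form found in the quadratic-neuron literature; the abstract does not state the exact function, so treat the formula, the ReLU activation, and the parameter names `w_r`, `b_r`, `w_g`, `b_g`, `w_b`, and `c` as assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def relu(z):
    return max(0.0, z)


def conventional_neuron(x, w, b):
    """Standard neuron: activation applied to an inner product plus bias."""
    return relu(dot(w, x) + b)


def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, c):
    """Quadratic neuron (assumed form):

    relu((w_r.x + b_r) * (w_g.x + b_g) + w_b.(x*x) + c)
    """
    squared = [xi * xi for xi in x]
    z = (dot(w_r, x) + b_r) * (dot(w_g, x) + b_g) + dot(w_b, squared) + c
    return relu(z)


if __name__ == "__main__":
    x = [1.0, 2.0]
    # With these weights the quadratic neuron computes the product x[0] * x[1],
    # which a single conventional neuron cannot express.
    print(quadratic_neuron(x, [1.0, 0.0], 0.0, [0.0, 1.0], 0.0, [0.0, 0.0], 0.0))
    print(conventional_neuron(x, [1.0, 1.0], 0.0))
```

A heterogeneous layer in this sense would simply hold some neurons of each kind side by side.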
Citations: 0
Linear Regression-Based Autonomous Intelligent Optimization for Constrained Multiobjective Problems
Pub Date : 2024-04-18 DOI: 10.1109/TAI.2024.3391230
Yan Wang;Xiaoyan Sun;Yong Zhang;Dunwei Gong;Hejuan Hu;Mingcheng Zuo
It is very challenging to autonomously generate algorithms suitable for constrained multiobjective optimization problems due to the diverse performance of existing algorithms. In this article, we propose a linear regression (LR)-based autonomous intelligent optimization method. It first extracts typical features of a constrained multiobjective optimization problem by focused sampling to form a feature vector. Then, a LR model is designed to learn the relationship between optimization problems and intelligent optimization algorithms (IOAs). Finally, the trained model autonomously generates a suitable IOA by inputting the feature vector. The proposed method is applied to six constrained multiobjective benchmark test sets with various characteristics and compared with seven popular optimization algorithms. The experimental results verify the effectiveness of the proposed method. In addition, the proposed method is used to solve the operation optimization problems of an integrated coal mine energy system, and the experimental results show its practicability.
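The pipeline described above (extract features, fit a regression, pick an algorithm from the prediction) can be sketched as follows. This is a deliberately reduced illustration, not the authors' method: a single scalar feature stands in for the feature vector, one least-squares line is fitted per candidate algorithm to predict a performance score, and the algorithm with the highest predicted score is chosen. The names `fit_line`, `select_algorithm`, `algo_a`, `algo_b`, and the training numbers are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def select_algorithm(models, feature):
    """Predict a score per algorithm and return the best name plus all predictions."""
    predictions = {
        name: slope * feature + intercept
        for name, (slope, intercept) in models.items()
    }
    best = max(predictions, key=predictions.get)
    return best, predictions


if __name__ == "__main__":
    # Hypothetical training data: problem feature vs. observed score (higher is better).
    features = [0.0, 1.0, 2.0, 3.0]
    scores = {
        "algo_a": [1.0, 0.8, 0.6, 0.4],
        "algo_b": [0.2, 0.4, 0.6, 0.8],
    }
    models = {name: fit_line(features, ys) for name, ys in scores.items()}
    print(select_algorithm(models, 0.5)[0])
    print(select_algorithm(models, 3.5)[0])
```

With a real feature vector, the same idea extends to multiple regression, with one coefficient per feature.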
Citations: 0
Guest Editorial: New Developments in Explainable and Interpretable Artificial Intelligence
Pub Date : 2024-04-16 DOI: 10.1109/TAI.2024.3356669
K. P. Suba Subbalakshmi;Wojciech Samek;Xia Ben Hu
This special issue brings together seven articles that address different aspects of explainable and interpretable artificial intelligence (AI). Over the years, machine learning (ML) and AI models have posted strong performance across several tasks. This has sparked interest in deploying these methods in critical applications like health and finance. However, to be deployable in the field, ML and AI models must be trustworthy. Explainable and interpretable AI are two areas of research that have become increasingly important to ensure trustworthiness and hence deployability of advanced AI and ML methods. Interpretable AI are models that obey some domain-specific constraints so that they are better understandable by humans. In essence, they are not black-box models. On the other hand, explainable AI refers to models and methods that are typically used to explain another black-box model.
Open Access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10500898
Citations: 0
Strategic Gradient Transmission With Targeted Privacy-Awareness in Model Training: A Stackelberg Game Analysis
Pub Date : 2024-04-16 DOI: 10.1109/TAI.2024.3389611
Hezhe Sun;Yufei Wang;Huiwen Yang;Kaixuan Huo;Yuzhe Li
Privacy-aware machine learning paradigms have sparked widespread concern due to their ability to safeguard the local privacy of data owners, preventing the leakage of private information to untrustworthy platforms or malicious third parties. This article focuses on characterizing the interactions between the learner and the data owner within this privacy-aware training process. Here, the data owner hesitates to transmit the original gradient to the learner due to potential cybersecurity issues, such as gradient leakage and membership inference. To address this concern, we propose a Stackelberg game framework that models the training process. In this framework, the data owner's objective is not to maximize the discrepancy between the learner's obtained gradient and the true gradient but rather to ensure that the learner obtains a gradient closely resembling one deliberately designed by the data owner, while the learner's objective is to recover the true gradient as accurately as possible. We derive the optimal encoder and decoder using mismatched cost functions and characterize the equilibrium for specific cases, balancing model accuracy and local privacy. Numerical examples illustrate the main results, and we conclude with expanding discussions to suggest future investigations into reliable countermeasure designs.
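The leader-follower structure of a Stackelberg game can be illustrated with a small grid search: the leader commits to an action first, the follower best-responds, and the leader chooses its action anticipating that response. The sketch is generic and does not reproduce the encoder and decoder derived in the article. The cost functions, the action grid, and the names `follower_best_response` and `stackelberg_equilibrium` are hypothetical; the leader plays the role of the data owner and the follower that of the learner.

```python
def follower_best_response(leader_action, follower_actions, follower_cost):
    """Follower picks the action that minimises its own cost given the leader's move."""
    return min(follower_actions, key=lambda b: follower_cost(leader_action, b))


def stackelberg_equilibrium(leader_actions, follower_actions, leader_cost, follower_cost):
    """Leader minimises its cost while anticipating the follower's best response."""
    best = None
    for a in leader_actions:
        b = follower_best_response(a, follower_actions, follower_cost)
        cost = leader_cost(a, b)
        if best is None or cost < best[2]:
            best = (a, b, cost)
    return best


if __name__ == "__main__":
    grid = [0.0, 0.25, 0.5, 0.75, 1.0]

    # Hypothetical mismatched costs: the follower wants its estimate b to match
    # the transmitted signal a; the leader wants b near 1 but pays for a large a.
    def follower_cost(a, b):
        return (b - a) ** 2

    def leader_cost(a, b):
        return (b - 1.0) ** 2 + 0.5 * a ** 2

    print(stackelberg_equilibrium(grid, grid, leader_cost, follower_cost))
```

The two players minimise different costs, which is the sense in which the costs are "mismatched"; with aligned costs the game would reduce to a joint optimisation.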
Citations: 0
An Explainable Intellectual Property Protection Method for Deep Neural Networks Based on Intrinsic Features
Pub Date : 2024-04-16 DOI: 10.1109/TAI.2024.3388389
Mingfu Xue;Xin Wang;Yinghao Wu;Shifeng Ni;Leo Yu Zhang;Yushu Zhang;Weiqiang Liu
Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which need to modify the model and do not consider/mention interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNN based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable. We extract the intrinsic features of the DNN model by using deep Taylor decomposition. Since the intrinsic feature is composed of unique interpretation of the model's decision, the intrinsic feature can be regarded as fingerprint of the model. If the fingerprint of a suspected model is the same as the original model, the suspected model is considered as a pirated model. Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model and the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning attack, pruning attack, watermark overwriting attack, and adaptive attack.
Citations: 0
A Unified Conditional Diffusion Framework for Dual Protein Targets-Based Bioactive Molecule Generation
Pub Date : 2024-04-11 DOI: 10.1109/TAI.2024.3387402
Lei Huang;Zheng Yuan;Huihui Yan;Rong Sheng;Linjing Liu;Fuzhou Wang;Weidun Xie;Nanjun Chen;Fei Huang;Songfang Huang;Ka-Chun Wong;Yaoyun Zhang
Advances in deep generative models shed light on de novo molecule generation with desired properties. However, generating molecules for dual protein targets still faces formidable challenges, including the shortage of protein 3-D structure data required for conditioned model training, the inflexibility of autoregressive sampling, and model generalization to unseen targets. To address these issues, this study proposes the diffusion model for dual targets-based molecule generation (DiffDTM), a novel, unified, structure-free deep generative framework built on a diffusion model. Specifically, DiffDTM takes as inputs representations of protein sequences and molecular graphs pretrained on large-scale datasets, rather than protein and molecular conformations, and incorporates an information fusion module to achieve conditional generation in a one-shot manner. We perform comprehensive multiview experiments to demonstrate that DiffDTM can generate druglike, synthesis-accessible, novel, and high-binding-affinity molecules targeting specific dual proteins, outperforming state-of-the-art (SOTA) models on multiple evaluation metrics. Furthermore, DiffDTM could directly generate molecules targeting dopamine receptor D2 (DRD2) and 5-hydroxytryptamine receptor 1A (HTR1A) as new antipsychotics. Experimental comparisons highlight the ability of DiffDTM to adapt easily to unseen dual targets and generate bioactive molecules, addressing the shortage of active molecule data for model training when new targets are encountered.
Citations: 0
An Intelligent Fingerprinting Technique for Low-Power Embedded IoT Devices
Pub Date : 2024-04-10 DOI: 10.1109/TAI.2024.3386498
Varun Kohli;Muhammad Naveed Aman;Biplab Sikdar
The Internet of Things (IoT) has been a popular topic for research and development in the past decade. The resource-constrained and wireless nature of IoT devices presents a large surface of vulnerabilities, and traditional network security methods involving complex cryptography are not feasible. Studies show that Denial of Service (DoS), physical intrusion, spoofing, and node forgery are prevalent threats in the IoT, and there is a need for robust, lightweight device fingerprinting schemes. We identify eight criteria of effective fingerprinting methods for resource-constrained IoT devices and propose an intelligent, lightweight, whitelist-based fingerprinting method that satisfies these properties. The proposed method uses the power-up Static Random Access Memory (SRAM) stack as fingerprint features and autoencoder networks (AEN) for fingerprint registration and verification. We also present a threat mitigation framework based on network isolation levels to handle potential and identified threats. Experiments are conducted with a heterogeneous pool of 10 advanced virtual reduced instruction set computer (AVR) Harvard architecture prover devices from different vendors, and Dell Latitude and Dell XPS 13 laptops are used as verifier testbeds. The proposed method has a 99.9% accuracy, 100% precision, and 99.6% recall on known and unknown heterogeneous devices, which is an improvement over several past works. The independence of fingerprints stored in the AENs enables easy distribution and update, and the observed evaluation latency ($\sim 10^{-4}$ s) and data collection latency ($\sim 1$ s) make our method practical for real-world scenarios. Lastly, we analyze the proposed method with regard to the eight criteria and highlight its limitations for future improvement.
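The registration and verification idea in this abstract can be illustrated with a rough sketch. It is not the authors' implementation. A linear autoencoder (PCA computed with NumPy's SVD) stands in for the paper's autoencoder networks, the SRAM reads are simulated bit patterns with random flips, and the threshold rule (mean plus four standard deviations of the enrollment error) is a hypothetical choice. All function names here are illustrative.

```python
import numpy as np

def simulate_sram(device_pattern, n_reads, flip_prob, rng):
    """Simulate noisy power-up reads: each bit flips with probability flip_prob."""
    flips = rng.random((n_reads, device_pattern.size)) < flip_prob
    return np.logical_xor(device_pattern, flips).astype(float)

def reconstruction_error(reads, mean, basis):
    """Per-read mean squared error after encoding and decoding with the basis."""
    centered = reads - mean
    recon = centered @ basis.T @ basis
    return np.mean((centered - recon) ** 2, axis=1)

def register(reads, n_components=4):
    """Fit a linear autoencoder (PCA) on one device's enrollment reads.

    Assumption: PCA replaces the paper's autoencoder network, and the
    acceptance threshold below is a hypothetical rule, not the paper's.
    """
    mean = reads.mean(axis=0)
    _, _, vt = np.linalg.svd(reads - mean, full_matrices=False)
    basis = vt[:n_components]
    errors = reconstruction_error(reads, mean, basis)
    threshold = errors.mean() + 4.0 * errors.std()
    return mean, basis, threshold

def verify(read, model):
    """Accept a read if its reconstruction error is within the threshold."""
    mean, basis, threshold = model
    error = reconstruction_error(read[None, :], mean, basis)[0]
    return bool(error <= threshold)

rng = np.random.default_rng(0)
pattern_a = rng.random(256) < 0.5   # enrolled device (simulated)
pattern_b = rng.random(256) < 0.5   # unknown device (simulated)

model_a = register(simulate_sram(pattern_a, 50, 0.02, rng))
genuine = simulate_sram(pattern_a, 200, 0.02, rng)
impostor = simulate_sram(pattern_b, 200, 0.02, rng)

print("genuine accept rate:", np.mean([verify(r, model_a) for r in genuine]))
print("impostor accept rate:", np.mean([verify(r, model_a) for r in impostor]))
```

Because each enrolled device gets its own model, a whitelist in this sketch is just a collection of independent `(mean, basis, threshold)` tuples, which loosely mirrors the abstract's point that independently stored fingerprints are easy to distribute and update.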
Citations: 0