Collective decision-making is vital and widespread in both human and artificial societies. Individuals often choose an option by assessing its intrinsic value through individual learning, but they are also influenced by peer pressure and may select an option through conformity-based social learning. A central question is whether a population can settle on the most beneficial option when social learning is involved. Previous studies of social learning have focused on well-mixed populations in which individuals are equally likely to interact with one another, yet real social interactions are often more subtle and are better modeled by a graph. It is therefore challenging to analyze theoretically the effect of social learning on collective decision-making in structured populations. To address this issue, we use evolutionary game theory to propose an evolutionary model of binary options that jointly integrates individual and social learning in any population structure. We first derive the average fraction of the option with higher merit by means of coalescing random walks and find that introducing conformity-based social learning is detrimental to the collective performance of decision-making. Interestingly, however, our theoretical analysis reveals that the majority of the population always favors the option with higher merit, regardless of the preference for social learning. Importantly, these theoretical predictions are valid for any population structure, and they are verified by extensive numerical simulations on three representative static interaction structures. We further show via computer simulations that they also hold in dynamic networks, and we demonstrate the robustness of our findings to different conformity-based social learning procedures.
{"title":"Collective Performance Induced by Social and Individual Learning in Any Population Structure: An Evolutionary Game Approach","authors":"Zhifang Li;Jingwei Zhang;Xiaojie Chen;Attila Szolnoki","doi":"10.1109/TAI.2025.3592636","DOIUrl":"https://doi.org/10.1109/TAI.2025.3592636","url":null,"abstract":"Collective decision-making is vital and widespread in human and artificial societies. Individuals often choose the option by assessing the intrinsic values of options in decision-making through individual learning. But they are also influenced by peer pressure and select the option by conformity-based social learning. A central question is whether the population can settle on the most beneficial option when social learning is involved. Previous studies concerning social learning focused on well-mixed populations where individuals are equally likely to interact with each other. But real social interactions are often more subtle that are modeled by a graph. Therefore, it is challenging to theoretically analyze the effect of social learning on collective decision-making in structured populations. To address this issue, using evolutionary game theory we propose an evolutionary model of binary options jointly integrating individual and social learning in any population structure. We first derive the average fraction of the option with higher merit by means of coalescing random walks and find that the introduction of conformity-based social learning is detrimental to collective performance of decision-making. Interestingly, however, our theoretical analysis reveals that the majority of the population always favors the option with higher merit regardless of the preference of social learning. Importantly, these theoretical predictions are valid for any population structure and they are verified by intensive numerical simulations made in three representative static interaction structures. We further show that they hold in dynamic networks via computer simulations. We also demonstrate the robustness of our findings to different conformity-based social learning procedures.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1143-1157"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146176027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-25 | DOI: 10.1109/TAI.2025.3591587
Jingpeng Sun;Chen Chen;Weiping Ding;Xiyuan Hu
Sleep disorders affect a significant portion of the global population and contribute to increased overall mortality. Automatic sleep staging through the analysis of physiological signals is pivotal for expanding sleep assessment and diagnostic capabilities. However, owing to the complex nonstationary characteristics of physiological signals and the differences between individual subjects, extracting effective features from these signals remains challenging. To this end, we propose a novel Transformer-based sleep staging method, SleepLog, which combines local and global information for feature extraction. First, a convolutional neural network (CNN)-based module extracts local information to capture the features of sleep characteristic wave events. Then, a self-attention-based patch encoder module extracts global information that reflects the transitions between different characteristic waves. The local and global information is fed to the Transformer encoder module, enabling the class (CLS) token of each branch to extract supplementary information from the associated features. Finally, we propose a simple yet effective cross-attention-based feature fusion module, which uses the single class token of each branch as a query to exchange information with the other branch. The proposed cross-attention requires only linear computational and memory complexity. To validate the performance of the proposed method, we evaluate SleepLog on the publicly available Sleep-EDF dataset. The experimental results show that the proposed model maintains superior performance, indicating its potential for developing and deploying home-environment automatic sleep staging systems.
{"title":"SleepLog: Local-Global Deep Fusion Learning for Sleep Staging Transformer","authors":"Jingpeng Sun;Chen Chen;Weiping Ding;Xiyuan Hu","doi":"10.1109/TAI.2025.3591587","DOIUrl":"https://doi.org/10.1109/TAI.2025.3591587","url":null,"abstract":"Sleep disorders affect a significant portion of the global population and contribute to increased overall mortality. Automatic sleep staging through analyzing physiological signals is pivotal in expanding sleep assessment and diagnostic capabilities. However, due to the complex nonstationary characteristics of physiological signals and the individual differences between subjects, obtaining the effective features of the physiological signals is still challenging. To this end, we propose a novel Transformer-based sleep staging method, SleepLog, to combine local and global information for feature extraction. First, a convolutional neural network (CNN)-based module was used to extract the local information to capture the features of sleep characteristic wave events. Then, we extract the global information that reflects the transformation between different characteristic waves using a self-attention-based patch encoder module. Furthermore, the local and global information was fed to the Transformer encoder module to enable the class (CLS) token of each branch to extract supplementary information from the associated features. Finally, we propose a simple yet effective cross-attention-based feature fusion module, which uses a single class token for each branch as a query to exchange information with other branches. The proposed cross-attention only requires linear time for both computational and memory complexity. To validate the performance of the proposed method, we evaluate SleepLog on a publicly available dataset Sleep-EDF. The experimental results show that the proposed model can maintain superior performance, indicating that it has the potential to develop and apply a home-environment automatic sleep staging system.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1084-1096"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146176014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep reinforcement learning (DRL) methods have recently shown promise in path planning tasks. However, when dealing with global planning tasks in mountainous terrain (2.5D) environments, these methods face serious challenges such as poor convergence and generalization. To this end, we propose learn once plan arbitrarily (LOPA), an enhanced DRL method that learns on a single map yet generalizes to topographically similar terrains. It thereby enables path planning across multiple mountainous terrain maps while balancing path distance and energy consumption. First, we analyze the causes of the convergence and generalization problems from the perspective of the DRL observation, revealing that the conventional design allows irrelevant map information to interfere with learning. Second, we develop LOPA, which utilizes a novel dynamic observation mechanism to better focus on the key information in the observation. This mechanism is realized in two steps: 1) a dynamic observation model transforms the DRL observation into two dynamic views, local and global, guiding LOPA to focus on the key information of the given maps; and 2) a dual-channel network processes these two views and integrates them to improve reasoning capability. Meanwhile, through a Rademacher complexity analysis, we provide theoretical justification for LOPA's improved generalization, demonstrating a lower upper bound on the generalization error. LOPA is validated through multiobjective global path planning experiments conducted on both simulated and real maps. The results show that LOPA achieves improved convergence and generalization performance as well as high planning efficiency.
{"title":"Learn Once Plan Arbitrarily (LOPA): Dynamic Observation-Based Deep Reinforcement Learning Method for Global Path Planning in Mountainous Terrain Environment","authors":"Shuqiao Huang;Mingxin Hou;Xiaofang Yuan;Xiru Wu;Yaonan Wang;Guoming Huang","doi":"10.1109/TAI.2025.3592648","DOIUrl":"https://doi.org/10.1109/TAI.2025.3592648","url":null,"abstract":"Deep reinforcement learning (DRL) methods have recently shown promise in path planning tasks. However, when dealing with global planning tasks in mountainous terrain (2.5D) environment, these methods face serious challenges such as poor convergence and generalization. To this end, we propose learn once plan arbitrarily (LOPA), an enhanced DRL method that learns on a single map yet generalizes to topographically similar terrains. Consequently, it enables path planning across multiple mountainous terrain maps while balancing path distance and energy consumption. First, we analyze the reasons for convergence and generalization problems from the perspective of DRL’s observation, revealing that the conventional design causes DRL to be interfered with irrelevant map information. Second, we develop the LOPA, which utilizes a novel dynamic observation mechanism to attain an improved capability in focusing on key information of the observation. Such a mechanism is realized by two steps: 1) a dynamic observation model is built to transform the DRL’s observation into two dynamic views: local and global, significantly guiding the LOPA to focus on the key information of the given maps; and 2) a dual-channel network is constructed to process these two views and integrate them to attain an improved reasoning capability. Meanwhile, through Rademacher Complexity analysis, we provide theoretical justification for LOPA’s improved generalization capability, demonstrating a lower upper bound on the generalization error. The LOPA is validated through multiobjective global path planning experiments conducted on both simulated and real maps. The results suggest that LOPA has improved convergence and generalization performance, as well as great planning efficiency.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1168-1184"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146176003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-25 | DOI: 10.1109/TAI.2025.3592635
Mayank Kumar Kundalwal;Deepak Mishra
Federated learning (FL) enables collaborative model training across decentralized data sources while preserving privacy. However, FL systems are vulnerable to attacks from malicious clients that can degrade model performance and compromise integrity. In this work, we propose the anomaly-resistant robust framework for federated learning (AR2FL), which enhances FL aggregation by leveraging mean latent representations of client updates. This data-driven approach enables the server to estimate interclient similarity and dynamically scale clients' contributions, reducing the influence of anomalous or adversarial updates. Unlike methods based on fixed distance metrics such as cosine similarity or Euclidean distance, AR2FL captures deeper statistical patterns in the latent space, enabling more accurate and secure model updates. Experiments on several datasets show that AR2FL maintains strong accuracy, fast convergence, and high robustness, making it suitable for secure large-scale FL.
{"title":"AR2FL: Anomaly-Resistant Robust Framework for Federated Learning","authors":"Mayank Kumar Kundalwal;Deepak Mishra","doi":"10.1109/TAI.2025.3592635","DOIUrl":"https://doi.org/10.1109/TAI.2025.3592635","url":null,"abstract":"Federated learning (FL) enables collaborative model training across decentralized data sources while preserving privacy. However, FL systems are vulnerable to attacks from malicious clients that can degrade model performance and compromise integrity. In this work, we propose anomaly-resistant robust framework for federated learning (AR2FL), an anomaly-resistant and robust framework that enhances FL aggregation by leveraging mean latent representations of client updates. This data-driven approach enables the server to estimate interclient similarity and dynamically scale clients contributions, reducing the influence of anomalous or adversarial updates. Unlike methods based on fixed distance metrics such as cosine similarity or Euclidean distance, AR2FL captures deeper statistical patterns in the latent space, enabling more accurate and secure model updates. Experiments on several datasets show AR2FL maintains strong accuracy, fast convergence, and high robustness, making it suitable for secure large-scale FL.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1131-1142"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146176020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep reinforcement learning (DRL) has shown significant success in domains such as computer vision and robot control. However, DRL agents often suffer from low sample efficiency, which limits their practical applicability in industrial settings. Recent advances in DRL, particularly model-based approaches, have sought to address this issue by leveraging imaginary data to improve decision-making and sampling efficiency. Despite their promise, these methods face challenges such as overreliance on early experiences in the replay buffer and underutilization of imaginary data, which can lead to overfitting and suboptimal policy optimization. To overcome these limitations, we propose a novel reinforcement learning framework, balanced sampling and reusing imaginary data (BSRID), which introduces two key innovations: 1) a balanced sampling (BS) mechanism that ensures uniform sampling rates to mitigate bias toward early experiences; and 2) a reusing imaginary data (RID) strategy that enhances policy optimization by increasing the update frequency and maximizing the utility of imaginary data. Experimental results on the Atari 100k benchmark demonstrate that BSRID significantly improves sample efficiency and achieves state-of-the-art (SOTA) performance. This work provides a robust and efficient solution for DRL applications in scenarios requiring high sample efficiency and reliable decision-making.
{"title":"Balanced Sampling and Reusing Imaginary Data for World Models in Reinforcement Learning","authors":"Qianyu Wang;Xuekai Wei;Jielu Yan;Leong Hou U;Huayan Pu;Jun Luo;Weijia Jia;Mingliang Zhou","doi":"10.1109/TAI.2025.3592174","DOIUrl":"https://doi.org/10.1109/TAI.2025.3592174","url":null,"abstract":"Deep reinforcement learning (DRL) has shown significant success in domains such as computer vision and robot control. However, DRL agents often suffer from low sample efficiency, limiting their practical applicability in industrial settings. Recent advances in model-based DRL, particularly model-based approaches, have sought to address this issue by leveraging imaginary data to improve decision-making and sampling efficiency. Despite their promise, these methods face challenges such as overreliance on early experiences in the replay buffer and under-utilization of imaginary data, which can lead to overfitting and suboptimal policy optimization. To overcome these limitations, we propose a novel reinforcement learning framework, balanced sampling and reusing imaginary data (BSRID), which introduces two key innovations: 1) a BS mechanism that ensures uniform sampling rates to mitigate bias toward early experiences; and 2) a RID strategy that enhances policy optimization by increasing update frequency and maximizing the utility of imaginary data. The experimental results on the Atari 100k benchmark demonstrate that BSRID significantly improves sample efficiency and achieves state-of-the-art (SOTA) performance. This work provides a robust and efficient solution for DRL applications in scenarios requiring high sample efficiency and reliable decision making.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1118-1130"},"PeriodicalIF":0.0,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-24 | DOI: 10.1109/TAI.2025.3592157
Christian Nash;Rajesh Nair;Syed Mohsen Naqvi
Attention deficit hyperactivity disorder (ADHD) is commonly identified in children, and its prevalence in adults is believed to be underreported. In this article, we aim to detect adult ADHD symptoms using two autoencoder architectures. We train and test on the novel multimodal ADHD dataset recorded under the Intelligent Sensing ADHD Trial in collaboration with the Cumbria, Northumberland, Tyne and Wear NHS Foundation Trust, U.K. The autoencoder architectures perform an image reconstruction task to optimize the latent bottleneck feature space, which is then used in downstream classification tasks to distinguish ADHD subjects from control participants. The RGB video data are exploited specifically to inform the autoencoders about hyperactivity symptoms. The audio data are used to further support hyperactivity symptoms while also capturing inattentive symptoms. The self-report questionnaire is a subjective measure in which individuals describe the ADHD symptoms they experience; it is a vital data source in the proposed work because it provides the autoencoders with previously unidentifiable symptoms. An ablation study demonstrates the effectiveness of each individual data modality and its associated discriminatory power. Using rigorous validation techniques, we achieve state-of-the-art classification accuracy, sensitivity, and specificity of 98.9%, 99.2%, and 98.5%, respectively. Given that ADHD classification is at present a largely subjective decision, the proposed work demonstrates that an objective system can provide robust support to ADHD clinicians in the future.
{"title":"Optimizing ADHD Detection: An Autoencoder Approach for Multimodal Classification","authors":"Christian Nash;Rajesh Nair;Syed Mohsen Naqvi","doi":"10.1109/TAI.2025.3592157","DOIUrl":"https://doi.org/10.1109/TAI.2025.3592157","url":null,"abstract":"Attention deficit hyperactivity disorder (ADHD) is commonly found in children, with the prevalence in adults said to be under-reported. In this article, we aim to detect adult ADHD symptoms using two autoencoder architectures. We train and test on the novel multimodal ADHD dataset recorded under the Intelligent Sensing ADHD Trial in collaboration with the Cumbria, Northumberland, Tyne and Wear NHS Foundation Trust, U.K. The autoencoder architectures perform an image reconstruction task to optimize the latent bottleneck feature space to perform downstream classification tasks to detect ADHD subjects or control participants. The RGB video data is specifically exploited to inform the autoencoders about the hyperactivity symptoms. The Audio data is used to further support hyperactivity symptoms while also hoping to gain scope on inattentive symptoms. The self report questionnaire is a subjective measure, where the individual can provide details of ADHD symptoms that they experience. It is a vital data source to include in the proposed work for providing the autoencoders with previously unidentifiable symptoms. An ablation study is undertaken to demonstrate the effectiveness of the individual data modality, attempting to distinguish the associated discriminatory power. Using rigorous validation techniques, we achieve a state-of-the-art classification accuracy, sensitivity, and specificity of 98.9%, 99.2%, and 98.5%, respectively. With ADHD classification being a preliminary subjective decision, the proposed work demonstrates that an objective system can provide robust support to ADHD clinicians in the future.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1107-1117"},"PeriodicalIF":0.0,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-22 | DOI: 10.1109/TAI.2025.3591580
Min Liu;Zhao Yao;Mutian Li;Chenqian Zhao;Jiale Xu;Jinhua Yu
Early diagnosis of breast cancer is critical for reducing mortality rates. Dynamic ultrasound videos contain rich tumor-specific features, offering valuable information for clinical diagnosis. In standard clinical practice, sonographers typically first identify keyframes and then scan the surrounding area to gather more information. Previous research on ultrasound videos has been devoted to temporal modeling while neglecting the contribution of keyframes to tumor diagnosis. In this article, we propose a two-stage hybrid network, the hybrid keyframe-guided video transformer (HKVT), to model both static keyframe and dynamic video information in breast ultrasound videos. In the first stage, the model uses a multiinstance learning paradigm to construct an efficient video classification model that automatically identifies keyframes from self-attention scores. In the second stage, the embedding tokens of the keyframe are extracted, and a keyframe-guided transformer block is constructed for ultrasound video classification. Specifically, we design a keyframe-guided temporal attention module and a keyframe-guided spatial coattention module to incorporate static keyframe features alongside dynamic video features. We evaluate the proposed model on an internal dataset of 342 patients and an external test dataset of 119 patients. The HKVT model achieves an area under the curve (AUC) of 0.921 on the internal dataset and 0.901 on the external test dataset, outperforming other state-of-the-art models. Furthermore, our model demonstrates robust performance on 242 multicenter test cases, outperforming other models by at least 2.1% in AUC. These results demonstrate the superiority of our approach for breast ultrasound video classification.
{"title":"A Hybrid Clinical Knowledge-Driven Transformer for Breast Ultrasound Video Classification","authors":"Min Liu;Zhao Yao;Mutian Li;Chenqian Zhao;Jiale Xu;Jinhua Yu","doi":"10.1109/TAI.2025.3591580","DOIUrl":"https://doi.org/10.1109/TAI.2025.3591580","url":null,"abstract":"Early diagnosis of breast cancer is critical for reducing mortality rates. Dynamic ultrasound videos contain rich tumor-specific features, offering valuable information for clinical diagnosis. In standard clinical practice, sonographers typically first identify keyframes before scanning the surrounding area of it to gather more information. Previous research based on ultrasound videos has been devoted to temporal modeling while neglecting the contribution of keyframes to tumor diagnosis. In this article, we propose a two-stage hybrid network, hybrid keyframe-guided video transformer (HKVT), to model both static keyframe and dynamic video information in breast ultrasound videos. In the first stage, the model uses a multiinstance learning paradigm to construct an efficient video classification model that automatically identifies keyframes using self-attention scores. In the second stage, the embedding tokens of the keyframe are extracted, and a keyframe-guided transformer block is constructed for ultrasound video classification. Specifically, we designed a keyframe-guided temporal attention module and a keyframe-guided spatial coattention module to incorporate static keyframe features alongside dynamic video features. We evaluated the proposed model on an internal dataset of 342 patients and an external test dataset of 119 patients. The HKVT model achieved an area under the curve (AUC) of 0.921 on the internal dataset and 0.901 on the external test dataset, outperforming other state-of-the-art models. Furthermore, our model demonstrated robust performance on 242 multicenter test cases, outperforming other models by at least 2.1% in AUC. These results demonstrate the superiority of our approach for breast ultrasound video classification.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1062-1072"},"PeriodicalIF":0.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146176018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-07-22 | DOI: 10.1109/TAI.2025.3591588
Shilajit Banerjee;Angshuman Paul
Knowledge distillation (KD) can be used to enhance the performance of lightweight student models with the help of knowledge from heavier teacher models. Most KD methods for classification use a one-teacher, one-student architecture in which a single teacher is responsible for transferring knowledge to a student for all classes. However, as the number of classes increases, it may become difficult for a single teacher to learn the salient characteristics of every class, which may in turn harm the performance of the student in a KD approach. In this article, we present a novel KD method in which an ensemble of lightweight students is trained by a pyramid of teachers. At the top level of the pyramid, one teacher learns all the class labels under consideration. As we go down the pyramid, the number of teachers increases at each level, but except for the top level, each teacher learns a smaller subset of classes than the teachers at the levels above it. Hence, different teachers learn different perspectives of the classification problem. Moreover, as we move down the pyramid, the teachers become increasingly specialized, whereas moving upward, they acquire an increasingly broad perspective of the classification problem. We design a novel distillation loss to distill knowledge between the students and the pyramid of teachers. Experimental results on publicly available datasets show the effectiveness of the proposed method. The code can be found at https://github.com/Shilajit77/Pyramid-Distill/tree/main.
{"title":"Knowledge Distillation for an Ensemble of Students From a Pyramid of Teachers With Diverse Perspective","authors":"Shilajit Banerjee;Angshuman Paul","doi":"10.1109/TAI.2025.3591588","DOIUrl":"https://doi.org/10.1109/TAI.2025.3591588","url":null,"abstract":"Knowledge distillation (KD) can be used for enhancing the performance of a lightweight student models with the help of knowledge from heavier teacher models. Most KD methods for classification use a one-teacher one-student architecture where only one teacher is responsible for transferring knowledge to a student for all the classes. However, when the number of classes increases, it may become difficult for a single teacher to learn the salient characteristics of all the classes. This may also adversely affect the performance of a student in a KD approach. In this article, we present a novel KD method where an ensemble of lightweight students is trained by a pyramid of teachers. At the top level of the pyramid, we have one teacher who learns all the class labels under consideration. As we go down the pyramid, the number of teachers increases at each level. However, except for the top level, each teacher learns a smaller subset of classes compared with its upper levels. Hence, different teachers learn different perspectives of the classification problem. In addition, as we move down the pyramid, the teachers become more and more specialized. On the contrary, as we move upward, the teachers learn a broader and broader perspective about the classification problem. We design a novel distillation loss to distill the knowledge between the student and the pyramid of teachers. Experimental results on publicly available datasets show the effectiveness of the proposed method. The code can be found at <uri>https://github.com/Shilajit77/Pyramid-Distill/tree/main</uri>.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1097-1106"},"PeriodicalIF":0.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Social media platforms are vital to modern communication, but they also enable the spread of harmful content, such as hate speech and misinformation. Current detection models, while accurate, are often resource-intensive and unsuitable for real-time or resource-constrained environments. Moreover, even models that incorporate multilingual capabilities often fail to generalize effectively across languages. To address this challenge, we propose CLARITY, a novel lightweight cross-modal transformer architecture designed for efficient and scalable harmful content detection. Unlike traditional models, CLARITY achieves faster processing while maintaining accuracy, making it accessible to a wider range of platforms and devices. CLARITY integrates text, image, and audio modalities to capture the complex multimodal interactions that enhance detection across diverse content types. By employing contrastive learning, CLARITY accurately distinguishes between reclaimed language and genuinely harmful content, significantly reducing false positives and promoting inclusivity, particularly for marginalized communities. Additionally, CLARITY incorporates a domain adaptation module with cross-lingual and multilingual capabilities, enabling it to generalize effectively across various platforms and ensuring robust performance even in dynamic online environments. We evaluate CLARITY across multiple benchmark datasets and GPUs, including Kaggle's Tesla P100, Colab Pro's NVIDIA T4, and the NVIDIA A100. The results demonstrate a significant reduction in inference time, with the A100 achieving an average inference time of 0.85 s per instance, over 30% faster than traditional models, while maintaining competitive accuracy.
{"title":"CLARITY: A Lightweight Multimodal Transformer for Harmful Content Detection","authors":"Gautam Siddharth Kashyap;Niharika Jain;Ebad Shabbir;Harsh Joshi;Usman Naseem;Jiechao Gao","doi":"10.1109/TAI.2025.3591585","DOIUrl":"https://doi.org/10.1109/TAI.2025.3591585","url":null,"abstract":"Social media platforms are vital to modern communication, but they also enable the spread of harmful content, such as hate speech and misinformation. Current detection models, while accurate, are often resource-intensive and unsuitable for real-time or resource-constrained environments. Moreover, even models that incorporate multilingual capabilities often fail to generalize effectively across different languages. To address this challenge, we propose CLARITY, a novel lightweight cross-modal transformer architecture designed for efficient and scalable harmful content detection. Unlike traditional models, CLARITY achieves faster processing while maintaining accuracy, making it accessible to a wider range of platforms and devices. CLARITY integrates text, image, and audio modalities to capture complex, multimodal interactions that enhance detection across diverse content types. By employing contrastive learning, CLARITY accurately distinguishes between reclaimed language and genuinely harmful content, significantly reducing false positives and promoting inclusivity, particularly for marginalized communities. Additionally, CLARITY incorporates a domain adaptation module with cross-lingual and multilingual, enabling it to generalize effectively across various platforms and ensuring robust performance even in dynamic online environments. We evaluate CLARITY across multiple benchmark datasets and GPUs, including Kaggle’s Tesla P100, Colab Pro’s NVIDIA T4, and NVIDIA A100. The results demonstrate a significant reduction in inference time, with the A100 achieving an average inference time of 0.85 s per instance—over 30% faster than traditional models—while maintaining competitive accuracy.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"7 2","pages":"1073-1083"},"PeriodicalIF":0.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}