IEEE Transactions on Artificial Intelligence: Latest Articles

C²RS: Multimodal Knowledge Graph Completion With Cross-Modal Consistency and Relation Semantics
Pub Date : 2025-03-05 DOI: 10.1109/TAI.2025.3548621
Yulou Shu;Wengen Li;Jiaqi Wang;Yichao Zhang;Jihong Guan;Shuigeng Zhou
Multimodal knowledge graph completion (MKGC) has been a popular research topic in recent years. However, existing methods rarely consider the alignment of different entity modalities in the process of multimodal fusion, and often lack sufficient attention to the semantic information conveyed by relations, thus resulting in unsatisfactory completion performance. To address these two issues, we propose a new MKGC model called C²RS. This model first designs a cross-modal consistency contrastive learning task to align different entity modalities for accurate entity representation. Then, C²RS develops a relation semantic encoding module based on the distributions of knowledge graph (KG) triples to extract the semantic information of relations for comprehensive relation representation. Finally, we encode the candidate triples with a triple encoder and identify the correct entities through a scoring function to complete the multimodal KG. According to the extensive experiments on three public MKGC datasets, C²RS obviously outperforms the baseline methods.
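The abstract does not say which contrastive loss is used for the cross-modal consistency task. As a rough illustration only, the sketch below assumes an InfoNCE-style loss over two views of the same entities; the names `struct_emb`, `visual_emb`, and `temperature` are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(struct_emb, visual_emb, temperature=0.1):
    """Mean InfoNCE loss; row i of both lists describes the same entity.

    Assumption: this is a generic contrastive objective, not the exact
    loss defined in the C2RS paper.
    """
    losses = []
    for i, anchor in enumerate(struct_emb):
        logits = [cosine(anchor, v) / temperature for v in visual_emb]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # The positive pair is the embedding of the same entity (index i).
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Two toy entities: modalities agree in the first call and are swapped in the second.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
```

When the two modalities of each entity point in similar directions the loss is close to zero, and it grows when they are mismatched, which is the behavior an alignment objective of this kind is meant to reward.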
IEEE Transactions on Artificial Intelligence, vol. 6, no. 11, pp. 2940-2952.
Citations: 0
Learn to Learn: A Mirror Meta-Learning Method for Retinal Disease Diagnosis on Fundus Images
Pub Date : 2025-03-05 DOI: 10.1109/TAI.2025.3566082
Haoran Peng;Jianqiang Li;Wenxiu Cheng;Linna Zhao;Yu Guan;Zhaosheng Li;Li Li;Xi Xu
Retinal diseases, such as glaucoma, age-related macular degeneration, and high myopia, are major contributors to global vision loss, emphasizing the need for early detection and intervention. Current deep learning approaches for diagnosing retinal diseases using fundus images primarily focus on single-disease classification due to the scarcity and expense of diverse datasets. This limitation restricts their generalization across multiple ocular diseases and impedes transfer learning to untrained disease types. In this article, we introduce a novel model-agnostic meta-learning framework, called mirror meta-learning (MML), which incorporates an autoencoder module to supervise the backpropagation path in few-shot learning, enhancing model initialization and adaptation. MML’s effectiveness is validated using four publicly available retinal disease binary classification datasets and a proprietary high myopia dataset. In addition, MML demonstrates robustness when tested on three well-established few-shot learning datasets. Our results show the proposed model’s superiority in terms of performance and generalizability in ocular disease classification tasks.
IEEE Transactions on Artificial Intelligence, vol. 6, no. 12, pp. 3391-3405.
Citations: 0
Lightweight Dynamic Convolutional Network for Crowd Counting Based on Curriculum Reinforcement Learning
Pub Date : 2025-03-05 DOI: 10.1109/TAI.2025.3566923
Yange Li;Fan Yu;Qun Chen
In public spaces, high pedestrian concentrations usually lead to congestion and may even pose trampling risks. Upon reaching a specific density threshold, implementing control measures becomes necessary to regulate pedestrian inflows. Therefore, detecting and identifying crowded areas is crucial for pedestrian flow control. Crowd counting is a key technique for achieving this goal. Recently, researchers have dedicated significant efforts to designing convolutional neural networks with various architectures for solving this problem. However, the existing models have structures with high computing power requirements for extreme situations, making it difficult for them to run on edge devices such as surveillance computers. In this article, we propose a lightweight crowd counting model with a dynamic convolutional kernel for the crowd counting task. The model is built via an encoder–decoder structure. The encoder extracts high-quality features through inverted residual layers implemented via MobileNetV2, which are replaced by a dynamic convolutional kernel. The decoder generates a density map through upsampling and linear layers. A skip connection structure is added to facilitate information exchange between the codecs and reduce the loss of information. Moreover, a training strategy based on curriculum reinforcement learning is presented. This strategy facilitates the integration of samples from diverse datasets, and the difficulty level of each sampling step is dynamically adjusted with a reinforcement learning model. In addition, this strategy can be used to organize the training sequence in each iteration on the basis of sample complexity, thereby achieving enhanced training stability and improved model performance. Comprehensive experimental evidence demonstrates that our model produces superior outcomes to those of competing methods across several benchmark datasets.
IEEE Transactions on Artificial Intelligence, vol. 6, no. 11, pp. 3115-3131.
Citations: 0
K-Nearest Neighbor Algorithm Based on the Framework of Ordered Pair of Normalized Real Numbers
Pub Date : 2025-03-05 DOI: 10.1109/TAI.2025.3566925
Yi Zheng;Xuanbin Ding;Xiang Zhao;Xiaoqin Pan;Lei Zhou
The K-nearest neighbors (kNNs) algorithm, a cornerstone of supervised learning, relies on similarity measures constrained by real-number-based distance metrics. A critical limitation of traditional kNN research lies in its confinement to the real-number domain, which inherently restricts its ability to model nonlinear feature interactions in high-dimensional data and amplifies sensitivity to feature redundancy and class imbalance. These limitations arise from the inherent linearity and unidimensional nature of real-number representations, which restrict their ability to model complex feature interdependencies. To transcend these limitations, this article proposes ordered pairs of normalized real numbers (OPNs)-kNN, a novel framework grounded in OPNs. Departing from the conventional real-number paradigm, OPNs-kNN constructs feature pairs as multidimensional OPNs tuples and employs a generalized OPNs-valued metric to explicitly model nonlinear relationships, thereby addressing the inherent shortcomings of real-number-based kNN. Extensive experiments on nine University of California, Irvine (UCI) benchmark datasets (e.g., glass, wines, and seeds) demonstrate that OPNs-kNN achieves statistically significant improvements in classification accuracy, precision, recall, and F1-score compared with traditional kNN and its enhanced variants. This work pioneers a nonreal-number computational framework, proving that moving beyond real-number constraints enables more expressive representations of data relationships, opening new directions for designing robust machine learning models in complex domains.
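For context, the conventional real-number kNN that this abstract takes as its baseline can be sketched as below. The OPNs-valued metric itself is not defined in the abstract, so it is not reproduced here; the `distance` parameter is a hypothetical hook showing where such a metric would be substituted, and the toy data is invented for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Ordinary real-valued Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Majority vote among the k training points closest to `query`.

    `distance` is pluggable: a different metric (for example an
    OPNs-valued one, which is NOT implemented here) could be passed in.
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy two-class data: three points near the origin, three near (1, 1).
train_x = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
train_y = ["a", "a", "a", "b", "b", "b"]
prediction_low = knn_predict(train_x, train_y, [0.15, 0.15], k=3)
prediction_high = knn_predict(train_x, train_y, [1.0, 0.95], k=3)
```

Keeping the metric as an argument makes it easy to compare a baseline and an alternative distance on the same data without touching the voting logic.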
IEEE Transactions on Artificial Intelligence, vol. 6, no. 11, pp. 3132-3147.
Citations: 0
Deep Graph Convolutional Autoencoder With Conditional Normalizing Flow for Power Distribution Systems Fault Classification and Location
Pub Date : 2025-03-04 DOI: 10.1109/TAI.2025.3547878
Mohsen Saffari;Mahdi Khodayar;Mohammad E. Khodayar;Seyed Saeed Fazlhashemi
Accurate fault classification and location are critical to ensure the reliability and resilience of large-scale power distribution systems (PDSs). The existing data-driven works in this area struggle to capture essential space-time correlations of PDS measurements and often rely on deterministic and shallow neural architectures. Furthermore, they encounter challenges such as over-smoothing and the inability to capture deep correlations. To overcome these limitations, a novel deep space-time generative graph convolutional autoencoder (SGGCA) is proposed. First, the PDS is modeled as a space-time graph where the nodes and edges show the bus measurements and line impedance values, respectively. The proposed SGGCA's encoder captures deep correlations of the space-time graph using a new graph convolution with early connections and identity transformations to mitigate the over-smoothing. Our encoder encompasses a new recurrent method to adjust graph convolution parameters without relying on node embeddings on the temporal dimension. Additionally, it incorporates generative modeling by capturing the probability distribution function of the latent representation through a conditional normalizing flow model. The extracted generative space-time features are enhanced by a multi-head attention mechanism to better capture task-relevant characteristics of the PDS measurements. The extracted features are fed to sparse decoders to classify and locate the faults in the PDS. The feature sparsity of decoders ensures a high generalization capacity and avoids overfitting. The proposed method is evaluated on the IEEE 69-bus and 123-bus systems. It achieves substantial improvements in fault classification accuracy by 3.33% and 6.26% and enhances fault location accuracy by 6.33% and 5.73% for the respective PDSs compared with state-of-the-art models.
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2448-2463.
Citations: 0
IEEE Transactions on Artificial Intelligence Publication Information
Pub Date : 2025-03-03 DOI: 10.1109/TAI.2025.3544009
IEEE Transactions on Artificial Intelligence, vol. 6, no. 2, pp. C2-C2. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908601
Citations: 0
Guest Editorial: Operationalizing Responsible AI
Pub Date : 2025-03-03 DOI: 10.1109/TAI.2025.3527806
Qinghua Lu;Apostol Vassilev;Jun Zhu;Foutse Khomh
IEEE Transactions on Artificial Intelligence, vol. 6, no. 2, pp. 252-253. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908600
Citations: 0
Large-Scale Heliostat Field Optimization for Solar Power Tower System Using Matrix-Based Differential Evolution
Pub Date : 2025-03-03 DOI: 10.1109/TAI.2025.3545813
Dan-Ting Duan;Jian-Yu Li;Bing Sun;Xiao-Fang Liu;Qiang Yang;Qi-Jia Jiang;Zhi-Hui Zhan;Sam Kwong;Jun Zhang
Intelligent optimization of a solar power tower heliostat field (SPTHF) is critical for harnessing solar energy in various scenarios. However, existing SPTHF optimization methods are typically based on specific geometric layout constraints and assume that each heliostat has the same size and height. As a result, these methods are not flexible or practical in many real-world SPTHF application scenarios. Therefore, this article proposes a novel flexible SPTHF (FSPTHF) model that is more practical and involves fewer assumptions. This model enables the use of different layouts and simultaneous optimization of the parameters of each heliostat. As an FSPTHF can involve hundreds or even thousands of heliostats, optimizing the parameters of all heliostats results in a challenging large-scale optimization problem. To efficiently solve this problem, this article proposes a matrix-based differential evolution algorithm, called HMDE, for large-scale heliostat design. The HMDE uses a matrix-based encoding and representation method to improve optimization accuracy and convergence speed, incorporating two novel designs. First, a dual elite-based mutation method is proposed to enhance the convergence speed of HMDE by learning from multiple elite individuals. Second, a multi-level crossover method is proposed to improve the optimization accuracy and convergence speed by integrating element-level and vector-level crossover based on matrix representation. Extensive experiments were conducted on 30 problem instances based on real-world data with three different layouts and problem dimensions up to 12 000, where state-of-the-art algorithms were used for comparison. The experimental results show that the proposed HMDE can effectively solve large-scale FSPTHF optimization problems.
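As background for readers unfamiliar with differential evolution, the following is a minimal sketch of the classic DE/rand/1/bin scheme on a toy objective. It is not HMDE: the matrix-based encoding, dual elite-based mutation, and multi-level crossover described in the abstract are not implemented, and the `sphere` objective and all parameter values are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Toy objective (sum of squares); stands in for a heliostat field model."""
    return sum(v * v for v in x)


def differential_evolution(objective, dim, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Classic DE/rand/1/bin with greedy one-to-one selection."""
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    initial_best = min(fitness)
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, low), high)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            # Keep the trial only if it is no worse than the current individual.
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best], initial_best


best_x, best_f, initial_f = differential_evolution(sphere, dim=5, bounds=(-5.0, 5.0))
```

Because selection is greedy, the best fitness in the population can never get worse from one generation to the next. In a large-scale setting such as the one the abstract describes, each individual would hold the parameters of every heliostat, which is why the per-gene loops above become the bottleneck that a matrix formulation aims to address.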
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2422-2436. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908719
Citations: 0
$n$-LIPO: Framework for Diverse Cooperative Agent Generation Using Policy Compatibility
Pub Date : 2025-03-01 DOI: 10.1109/TAI.2025.3566067
Rujikorn Charakorn;Poramate Manoonpong;Nat Dilokthanakul
Diverse training partners in multiagent tasks are crucial for training a robust and adaptable cooperative agent. Prior methods often rely on state-action information to diversify partners’ behaviors, but this can lead to minor variations instead of diverse behaviors and solutions. We address this limitation by introducing a novel training objective based on “policy compatibility.” Our method learns diverse behaviors by encouraging agents within a team to be compatible with each other while being incompatible with agents from other teams. We theoretically prove that incompatible policies are inherently dissimilar, allowing us to use policy compatibility as a proxy for diversity. We call this method learning incompatible policies for $n$-player cooperative games ($n$-LIPO). We propose to further diversify individual policies by incorporating a mutual information objective using state-action information. We empirically demonstrate that $n$-LIPO effectively generates diverse joint policies in various two-player and multi-player cooperative environments. In a complex cooperative task, two-player multi-recipe Overcooked, we find that $n$-LIPO generates a population of behaviorally diverse partners. These populations are then used to train robust generalist agents that can generalize better than using baseline populations. Finally, we demonstrate that $n$-LIPO can be applied to a high-dimensional StarCraft multiagent challenge (SMAC) multiplayer cooperative environment to discover diverse winning strategies when only a single goal exists. Additional visualization can also be accessed at https://sites.google.com/view/n-lipo/home.
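The idea of rewarding within-team compatibility while penalizing cross-team compatibility can be illustrated on a toy two-player matrix game. The objective below (self-play return minus weighted average cross-play return) is an assumed simplification for illustration, not the exact formulation from the paper; `compatibility_objective`, `weight`, and the payoff matrix are all hypothetical.

```python
def expected_return(policy_a, policy_b, payoff):
    """Expected shared reward when player 1 uses policy_a and player 2 uses policy_b."""
    return sum(
        pa * pb * payoff[i][j]
        for i, pa in enumerate(policy_a)
        for j, pb in enumerate(policy_b)
    )


def compatibility_objective(team, others, payoff, weight=1.0):
    """Self-play return minus the average cross-play return with other teams.

    Assumption: a simplified stand-in for a policy-compatibility objective.
    Each team is a pair (player-1 policy, player-2 policy) of action
    probability lists.
    """
    self_play = expected_return(team[0], team[1], payoff)
    if not others:
        return self_play
    cross_play = 0.0
    for other in others:
        cross_play += expected_return(team[0], other[1], payoff)
        cross_play += expected_return(other[0], team[1], payoff)
    return self_play - weight * cross_play / (2 * len(others))


# Coordination game: both players are rewarded only for choosing the same action.
payoff = [[1.0, 0.0], [0.0, 1.0]]
team_x = ([1.0, 0.0], [1.0, 0.0])     # both always pick action 0
team_y = ([0.0, 1.0], [0.0, 1.0])     # both always pick action 1
team_copy = ([1.0, 0.0], [1.0, 0.0])  # same convention as team_x

diverse = compatibility_objective(team_x, [team_y], payoff)
redundant = compatibility_objective(team_x, [team_copy], payoff)
```

Both `team_x` and `team_y` solve the game on their own, but they fail when mixed, so the objective scores them as distinct solutions. A team that merely copies an existing convention gains nothing, which matches the abstract's use of incompatibility as a proxy for diversity.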
IEEE Transactions on Artificial Intelligence, vol. 6, no. 11, pp. 3100-3114. Published 2025-03-01. DOI: https://doi.org/10.1109/TAI.2025.3566067
Citations: 0
GeoDCL: Weak Geometrical Distortion-Based Contrastive Learning for Fine-Grained Fashion Image Retrieval
Pub Date : 2025-02-28 DOI: 10.1109/TAI.2025.3545791
Ling Xiao;Toshihiko Yamasaki
This article addresses fine-grained fashion image retrieval (FIR), which aims at the detailed and precise retrieval of fashion items from extensive databases. Conventional fine-grained FIR methods design complex attention modules to enhance attribute-aware feature discrimination. However, they often ignore the multiview characteristics of real-world fashion data, leading to diminished model accuracy. Furthermore, our empirical analysis revealed that the straightforward application of standard contrastive learning methods to fine-grained FIR often yields suboptimal results. To alleviate this issue, we propose a novel weak geometrical distortion-based contrastive learning (GeoDCL) strategy. Specifically, GeoDCL incorporates both a novel positive pair design and a novel contrastive loss. GeoDCL can be seamlessly integrated into state-of-the-art (SOTA) fine-grained FIR methods during the training stage to enhance performance during inference. When GeoDCL is applied, the model structures of SOTA methods require no modifications. Additionally, GeoDCL is not utilized during inference, ensuring no increase in inference time. Experiments on the FashionAI, DeepFashion, and Zappos50K datasets verified GeoDCL's effectiveness in consistently improving SOTA models. In particular, GeoDCL drastically improved ASENet_V2 from 60.76% to 66.48% in mAP on the FashionAI dataset.
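For readers unfamiliar with contrastive training, the following minimal sketch shows a standard InfoNCE-style loss for a single anchor. It is a generic baseline and not the novel loss or positive pair design proposed in the article; the embedding vectors, the `temperature` value, and the idea that the positive comes from a weakly distorted view of the same image are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for one anchor (generic, not GeoDCL's loss).

    anchor:    embedding of the original image (hypothetical)
    positive:  embedding of a weakly distorted view of the same image
    negatives: embeddings of other images
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
    # Negative log-probability assigned to the positive pair.
    return log_denominator - logits[0]


# Hypothetical 2-D embeddings: the positive is close to the anchor.
loss_close = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_far = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The loss is small when the anchor is more similar to its positive than to any negative, so `loss_close` comes out lower than `loss_far`. Because a loss of this kind is applied only during training, it adds no cost at inference, which matches the claim in the abstract.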
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2409-2421. Published 2025-02-28. DOI: https://doi.org/10.1109/TAI.2025.3545791
Citations: 0