
IEEE Transactions on Emerging Topics in Computational Intelligence: Latest Articles

GT-WHAR: A Generic Graph-Based Temporal Framework for Wearable Human Activity Recognition With Multiple Sensors
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-02; DOI: 10.1109/TETCI.2024.3378331
Hailin Zou;Zijie Chen;Jing Zhang;Lei Wang;Fuchun Zhang;Jianqing Li;Yuanyuan Pan
Using wearable sensors to identify human activities has attracted significant interest in ubiquitous computing as a way to facilitate everyday life. Recent research has employed hybrid models to better leverage sensor modality information and temporal information, improving the performance of wearable human activity recognition. Nevertheless, the lack of effective exploitation of human structural information and the limited capacity for cross-channel fusion remain major challenges. This study proposes a generic design, called GT-WHAR, to accommodate varying application scenarios and datasets while performing effective feature extraction and fusion. First, a novel and unified representation paradigm, Body-Sensing Graph Representation, is proposed to represent body movement as a graph set, which incorporates structural information by considering the intrinsic connectivity of the skeletal structure. Second, the newly designed Body-Node Attention Graph Network employs graph neural networks to extract and fuse the cross-channel information within the graph set. Finally, the graph network is embedded in the proposed Bidirectional Temporal Learning Network, facilitating the extraction of temporal information in conjunction with the learned structural features. GT-WHAR outperformed state-of-the-art methods in extensive experiments on benchmark datasets, demonstrating its validity and efficacy. In addition, we demonstrate the generality of the framework through multiple research questions and provide an in-depth investigation of various influential factors.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 6, pp. 3912-3924
Citations: 0
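To make the body-sensing graph idea from the GT-WHAR abstract more concrete, here is a minimal Python sketch. It is not the authors' model: the five sensor placements, the skeletal links between them, and the plain mean aggregation (standing in for the attention-based graph network) are all assumptions made for illustration, since the abstract does not specify them.

```python
# Hypothetical sensor placements (graph nodes) and skeletal links (graph edges).
NODES = ["chest", "left_wrist", "right_wrist", "left_ankle", "right_ankle"]
EDGES = [("chest", "left_wrist"), ("chest", "right_wrist"),
         ("chest", "left_ankle"), ("chest", "right_ankle")]


def build_adjacency(nodes, edges):
    """Return a symmetric adjacency matrix with self-loops."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        adj[index[a]][index[b]] = 1.0
        adj[index[b]][index[a]] = 1.0
    return adj


def mean_aggregate(adj, features):
    """One message-passing step: each node averages its own features
    with those of its skeletal neighbours."""
    dim = len(features[0])
    fused = []
    for row in adj:
        degree = sum(row)
        fused.append([
            sum(row[j] * features[j][d] for j in range(len(row))) / degree
            for d in range(dim)
        ])
    return fused


# One graph per time step; each node carries a made-up 3-axis reading.
features = [[0.0, 0.0, 1.0],
            [1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
adj = build_adjacency(NODES, EDGES)
fused = mean_aggregate(adj, features)
print(fused[0])  # chest features after fusing with all four limbs
```

A sequence of such fused graphs, one per time step, is what a temporal model would then consume.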
Physics-Informed Graph Capsule Generative Autoencoder for Probabilistic AC Optimal Power Flow
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-02; DOI: 10.1109/TETCI.2024.3377671
Mohsen Saffari;Mahdi Khodayar;Mohammad E. Khodayar
Due to the increasing demand for electricity and the inherent uncertainty in power generation, finding efficient solutions to the stochastic alternating current optimal power flow (AC-OPF) problem has become crucial. However, the nonlinear and non-convex nature of AC-OPF, coupled with the growing stochasticity resulting from the integration of renewable energy sources, presents significant challenges in achieving fast and reliable solutions. To address these challenges, this study proposes a novel graph-based generative methodology that effectively captures the uncertainties in power system measurements, enabling the learning of probability distribution functions for generation dispatch and voltage setpoints. Our approach involves modeling the power system as a weighted graph and utilizing a deep spectral graph convolution network to extract powerful spatial patterns from the input graph measurements. A unique variational approach is introduced to identify the most relevant latent features that accurately describe the setpoints of the AC-OPF problem. Additionally, a capsule network with a new greedy dynamic routing algorithm is proposed to precisely decode the latent features and estimate the probabilistic AC-OPF problem. Further, a set of carefully designed physics-informed loss functions is incorporated in the training procedure of the model to ensure adherence to the fundamental physics rules governing power systems. Notably, the proposed physics-informed loss functions not only enhance the accuracy of AC-OPF estimation by effectively regularizing the deep learning model but also significantly reduce the time complexity. Extensive experimental evaluations conducted on various benchmarks demonstrate our proposed model's superiority over both probabilistic and deterministic approaches in terms of relevant criteria.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 5, pp. 3382-3395
Citations: 0
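The abstract above does not give the form of its physics-informed loss functions. As one hedged illustration of the general idea, the Python sketch below penalises violations of the standard AC power-flow balance equations. The function names, the two-bus toy network, and the choice of a mean squared mismatch are assumptions for illustration, not the paper's formulation.

```python
import math


def injected_power(v_mag, v_ang, g, b):
    """Active and reactive injections from the AC power-flow equations:
    P_i = sum_j |V_i||V_j| (G_ij cos(th_ij) + B_ij sin(th_ij))
    Q_i = sum_j |V_i||V_j| (G_ij sin(th_ij) - B_ij cos(th_ij))"""
    n = len(v_mag)
    p, q = [0.0] * n, [0.0] * n
    for i in range(n):
        for j in range(n):
            theta = v_ang[i] - v_ang[j]
            scale = v_mag[i] * v_mag[j]
            p[i] += scale * (g[i][j] * math.cos(theta) + b[i][j] * math.sin(theta))
            q[i] += scale * (g[i][j] * math.sin(theta) - b[i][j] * math.cos(theta))
    return p, q


def power_balance_penalty(v_mag, v_ang, p_net, q_net, g, b):
    """Mean squared mismatch between predicted net injections and the
    injections implied by the predicted voltages."""
    p, q = injected_power(v_mag, v_ang, g, b)
    n = len(v_mag)
    return sum((p[i] - p_net[i]) ** 2 + (q[i] - q_net[i]) ** 2 for i in range(n)) / n


# Toy two-bus network joined by a lossless line with reactance 0.1 p.u.
G = [[0.0, 0.0], [0.0, 0.0]]
B = [[-10.0, 10.0], [10.0, -10.0]]
v_mag, v_ang = [1.0, 1.0], [0.0, -0.1]

p_true, q_true = injected_power(v_mag, v_ang, G, B)
consistent = power_balance_penalty(v_mag, v_ang, p_true, q_true, G, B)
violating = power_balance_penalty(v_mag, v_ang, [p_true[0] + 0.5, p_true[1]], q_true, G, B)
print(consistent, violating)
```

Added to a data-fitting loss, a term of this kind pushes a model toward setpoints that are physically consistent, which is the regularising effect the abstract describes.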
Enhanced Adjacency-Constrained Hierarchical Clustering Using Fine-Grained Pseudo Labels
IF 5.3; Zone 3 (Computer Science); Q1 (Mathematics); Pub Date: 2024-04-02; DOI: 10.1109/TETCI.2024.3367811
Jie Yang;Chin-Teng Lin
Hierarchical clustering is able to provide partitions of different granularity levels. However, most existing hierarchical clustering techniques perform clustering in the original feature space of the data, which may suffer from overlap, sparseness, or other undesirable characteristics, resulting in noncompetitive performance. In the field of deep clustering, learning representations using pseudo labels has recently become a research hotspot. Yet most existing approaches employ coarse-grained pseudo labels, which may contain noise or incorrect labels. Hence, the learned feature space does not produce a competitive model. In this paper, we introduce the idea of fine-grained labels of supervised learning into unsupervised clustering, giving rise to the enhanced adjacency-constrained hierarchical clustering (ECHC) model. The full framework comprises four steps. One, adjacency-constrained hierarchical clustering (CHC) is used to produce relatively pure fine-grained pseudo labels. Two, those fine-grained pseudo labels are used to train a shallow multilayer perceptron to generate good representations. Three, the corresponding representation of each sample in the learned space is used to construct a similarity matrix. Four, CHC is used to generate the final partition based on the similarity matrix. The experimental results show that the proposed ECHC framework not only outperforms 14 shallow clustering methods on eight real-world datasets but also surpasses current state-of-the-art deep clustering models on six real-world datasets. In addition, on five real-world datasets, ECHC achieves comparable results to supervised algorithms.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 3, pp. 2481-2492
Citations: 0
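Steps one and four of the ECHC framework described above both rely on adjacency-constrained hierarchical clustering. The Python sketch below illustrates only that building block, under stated assumptions: average linkage, Euclidean distance, and a hand-written chain adjacency are choices made here and are not taken from the paper. The multilayer perceptron of step two is omitted.

```python
def constrained_agglomerative(points, neighbours, n_clusters):
    """Average-linkage agglomerative clustering in which two clusters may
    merge only if an adjacency edge connects them."""
    clusters = {i: [i] for i in range(len(points))}
    links = {i: set(neighbours[i]) for i in range(len(points))}

    def dist(a, b):
        # Mean Euclidean distance over all cross-cluster point pairs.
        pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
        total = sum(
            sum((x - y) ** 2 for x, y in zip(points[p], points[q])) ** 0.5
            for p, q in pairs
        )
        return total / len(pairs)

    while len(clusters) > n_clusters:
        candidates = [(dist(a, b), a, b) for a in clusters for b in links[a] if a < b]
        if not candidates:  # adjacency graph is disconnected: no merge allowed
            break
        _, a, b = min(candidates)
        clusters[a].extend(clusters.pop(b))
        # Neighbours of b become neighbours of the merged cluster a.
        for c in links.pop(b):
            links[c].discard(b)
            if c != a:
                links[c].add(a)
                links[a].add(c)

    labels = [0] * len(points)
    for label, members in enumerate(clusters.values()):
        for m in members:
            labels[m] = label
    return labels


# Six 1-D points linked in a chain 0-1-2-3-4-5 (hypothetical adjacency).
points = [[0.0], [0.1], [0.2], [5.0], [5.1], [0.15]]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = constrained_agglomerative(points, chain, n_clusters=2)
print(labels)
```

Point 5 lies close to points 0 to 2 in feature space, but its only adjacency link is to point 4, so the constraint places it in the other cluster. This is the behavioural difference from unconstrained hierarchical clustering.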
Transformer and Graph Convolution-Based Unsupervised Detection of Machine Anomalous Sound Under Domain Shifts
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-02; DOI: 10.1109/TETCI.2024.3377728
Jingke Yan;Yao Cheng;Qin Wang;Lei Liu;Weihua Zhang;Bo Jin
Thanks to the development of deep learning, machine abnormal sound detection (MASD) based on unsupervised learning has exhibited excellent performance. However, in the task of unsupervised MASD, there are discrepancies between the acoustic characteristics of the test set and the training set under the physical parameter changes (domain shifts) of the same machine's operating conditions. Existing methods not only struggle to stably learn the sound signal features under various domain shifts but also inevitably increase computational overhead. To address these issues, we propose an unsupervised machine abnormal sound detection model based on Transformer and Dynamic Graph Convolution (Unsuper-TDGCN) in this paper. Firstly, we design a network that models time-frequency domain features to capture both global and local spatial and time-frequency interactions, thus improving the model's stability under domain shifts. Then, we introduce a Dynamic Graph Convolutional Network (DyGCN) to model the dependencies between features under domain shifts, enhancing the model's ability to perceive changes in domain features. Finally, a Domain Self-adaptive Network (DSN) is employed to compensate for the performance decline caused by domain shifts, thereby improving the model's adaptive ability for detecting anomalous sounds in MASD tasks under domain shifts. The effectiveness of our proposed model has been validated on multiple datasets.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 4, pp. 2827-2842
Citations: 0
Spatial Temporal Aggregation for Efficient Continuous Sign Language Recognition
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-02; DOI: 10.1109/TETCI.2024.3378649
Lianyu Hu;Liqing Gao;Zekang Liu;Wei Feng
Despite recent progress in continuous sign language recognition (CSLR), most state-of-the-art methods process input sign language videos frame by frame to predict sentences. This usually imposes a heavy computational burden and is inefficient or even infeasible in real-world scenarios. Inspired by the fact that videos are inherently redundant and not all frames are essential for recognition, we propose spatial temporal aggregation (STAgg) to address this problem. Specifically, STAgg synthesizes adjacent similar frames into a unified robust representation before feeding it into the recognition module, thus greatly reducing computational complexity and memory demand. We first give a detailed analysis of commonly used aggregation methods such as subsampling, max pooling, and averaging, and then derive STAgg from the expected design criterion. Extensive ablation studies verify that our three proposed STAgg variants are superior to the commonly used pooling and subsampling counterparts in both accuracy and efficiency. The best version achieves accuracy comparable to state-of-the-art competitors while being 1.35× faster at 0.50× the computational cost, 0.70× the training time, and 0.65× the memory usage. Experiments on four large-scale datasets with multiple backbones verify the generalizability and effectiveness of the proposed STAgg. Another advantage of STAgg is that it enables more powerful backbones, which may further boost the accuracy of CSLR under similar computational and memory budgets. We also visualize the results of STAgg to support intuitive and insightful analysis of its effects.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 6, pp. 3925-3935
Citations: 0
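The STAgg abstract above compares its method against three common temporal aggregation baselines: subsampling, max pooling, and averaging. The Python sketch below shows those baselines only, applied to per-frame feature vectors. STAgg itself is not specified in the abstract and is not reproduced here, and the window size and toy features are assumptions.

```python
def subsample(frames, window):
    """Keep the first frame of every window and drop the rest."""
    return [frames[i] for i in range(0, len(frames), window)]


def pool(frames, window, reducer):
    """Reduce each window of frames to one vector, feature by feature."""
    pooled = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        pooled.append([reducer(values) for values in zip(*chunk)])
    return pooled


def mean(values):
    return sum(values) / len(values)


# Four frames with two features each (made-up numbers).
frames = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]

sub = subsample(frames, 2)   # [[1.0, 0.0], [5.0, 4.0]]
mx = pool(frames, 2, max)    # [[3.0, 2.0], [7.0, 6.0]]
avg = pool(frames, 2, mean)  # [[2.0, 1.0], [6.0, 5.0]]
print(sub, mx, avg)
```

All three halve the sequence length, and with it the downstream cost. They differ in what they discard: subsampling ignores half the frames entirely, while pooling and averaging blend them.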
Adaptive Strategies and its Application in the Mittag-Leffler Synchronization of Delayed Fractional-Order Complex-Valued Reaction-Diffusion Neural Networks
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-01; DOI: 10.1109/TETCI.2024.3375450
G. Narayanan;M. Syed Ali;Rajagopal Karthikeyan;Grienggrai Rajchakit;Sumaya Sanober;Pankaj Kumar
This paper addresses the Mittag-Leffler synchronization problem of fractional-order reaction-diffusion complex-valued neural networks (FRDCVNNs) with delays. New Mittag-Leffler synchronization (MLS) criteria in the form of the $p$-norm for an error model derived from the drive-response model are constructed. In the design of the adaptive feedback controller, the Lyapunov approach is considered in the framework of the $p$-norm technique, and less conservative algebraic conditions that guarantee MLS for the considered model are given. Moreover, the MLS of the considered model without reaction diffusion effect is investigated using adaptive control. Finally, an example is used to validate the proposed control scheme. To demonstrate the advantages and superiority of the proposed technique over existing methods, an image encryption method based on MLS of FRDCVNNs is considered and solved using the proposed method.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 5, pp. 3294-3307
Citations: 0
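Mittag-Leffler synchronization, as discussed in the abstract above, is named after the Mittag-Leffler function $E_\alpha(z) = \sum_{k \ge 0} z^k / \Gamma(\alpha k + 1)$, which generalises the exponential and typically bounds the decay of the synchronization error in fractional-order systems. The Python sketch below evaluates it with a truncated series. The truncation length is an assumption that is adequate only for moderate $|z|$ and $0 < \alpha \le 2$, and nothing here reproduces the paper's criteria.

```python
import math


def mittag_leffler(alpha, z, terms=60):
    """Truncated power series for the one-parameter Mittag-Leffler function.
    Intended for 0 < alpha <= 2 and moderate |z| only."""
    return sum(z ** k / math.gamma(alpha * k + 1) for k in range(terms))


# Sanity checks against known special cases.
print(mittag_leffler(1.0, -1.0), math.exp(-1.0))  # E_1(z) = exp(z)
print(mittag_leffler(2.0, -1.0), math.cos(1.0))   # E_2(-z^2) = cos(z)

# For a fractional order such as alpha = 0.8, E_alpha(-t^alpha) decays from 1.
decay = [mittag_leffler(0.8, -(t ** 0.8)) for t in (0.0, 0.5, 1.0, 2.0)]
print(decay)
```

For $\alpha = 1$ the bound reduces to ordinary exponential decay, which is why Mittag-Leffler synchronization is often described as the fractional-order counterpart of exponential synchronization.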
A Fixed-Time Robust ZNN Model With Adaptive Parameters for Redundancy Resolution of Manipulators
IF 5.3; Zone 3 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2024-04-01; DOI: 10.1109/TETCI.2024.3377672
Mengrui Cao;Lin Xiao;Qiuyue Zuo;Ping Tan;Yongjun He;Xieping Gao
Owing to the excellent capability of zeroing neural networks (ZNNs) to solve time-varying problems, many ZNN-based redundancy resolution schemes have been proposed for robots. This work proposes a fixed-time robust ZNN (FTRZNN) model with adaptive parameters to effectively address redundancy resolution problems of robots in the presence of noise. Unlike existing ZNN models, the FTRZNN has a fixed-time activation function and two adaptive parameters, which greatly improve its convergence speed and robustness. Establishing the FTRZNN for redundancy resolution consists of two steps: 1) converting the target practical problem into nonlinear equations; and 2) deriving an FTRZNN for solving the equations. To provide convincing evidence of the significant advantages of the FTRZNN over existing ZNN models, a theoretical analysis of the convergence and robustness of the FTRZNN is given, and the performance of the FTRZNN model is compared with that of existing ZNN models on path tracking tasks using a 6R manipulator under different noise disturbances. Finally, the FTRZNN model is employed to control two robot manipulators (UR5 and Jaco) to track desired paths under noise interference, simulated on a robotic simulation platform (i.e., CoppeliaSim). Simulation results indicate the effectiveness and potential practical value of the FTRZNN model.
由于归零神经网络(ZNN)具有出色的时变问题解决能力,许多基于 ZNN 的机器人冗余解析方案被提出。本研究提出了一种具有自适应参数的固定时间鲁棒 ZNN(FTRZNN)模型,以有效解决机器人在噪声存在时的冗余解决难题。与现有的 ZNN 模型不同,FTRZNN 具有一个固定时间激活函数和两个自适应参数,这大大提高了其收敛速度和鲁棒性。建立 FTRZNN 来处理冗余解析问题包括两个步骤:1) 首先将目标实际问题转换为非线性方程;2) 推导出用于求解方程的 FTRZNN。为了有力证明 FTRZNN 相对于现有 ZNN 模型的显著优势,本文对 FTRZNN 的收敛性和鲁棒性进行了理论分析,并比较了 FTRZNN 模型与现有 ZNN 模型在不同噪声干扰下使用 6R 机械手执行路径跟踪任务时的性能。最后,利用 FTRZNN 模型控制两个机器人机械手(UR5 和 Jaco)在噪声干扰下跟踪所需路径,并在机器人仿真平台(即 CoppeliaSim)上进行了仿真。仿真结果表明了 FTRZNN 模型的有效性和潜在实用价值。
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 6, pp. 3886-3898. https://doi.org/10.1109/TETCI.2024.3377672
Citations: 0
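The abstract above describes a two-step recipe: cast the problem as time-varying nonlinear equations, then derive a ZNN that drives the equation error to zero. The following is a minimal sketch of that general idea only, using the basic ZNN design formula de/dt = -gamma * e with a linear activation and a forward-Euler step. It is not the FTRZNN model: the fixed-time activation function, the adaptive parameters, and the manipulator kinematics are not reproduced, and the scalar test equation, the function name `znn_track_root`, and all parameter values are assumptions chosen for illustration.

```python
import math


def znn_track_root(gamma=10.0, h=1e-3, t_end=5.0, x0=0.0):
    """Track the root x(t) of f(x, t) = x**3 + x - (2 + sin t) with a basic ZNN.

    The error e(t) = f(x(t), t) is forced to obey de/dt = -gamma * e.
    Expanding de/dt = f_x * x_dot + f_t gives the state update below.
    """
    x, t = x0, 0.0
    for _ in range(int(round(t_end / h))):
        f = x**3 + x - (2.0 + math.sin(t))   # current equation error e(t)
        df_dx = 3.0 * x**2 + 1.0             # always >= 1, so the division is safe
        df_dt = -math.cos(t)                 # explicit time derivative of f
        x_dot = (-gamma * f - df_dt) / df_dx
        x += h * x_dot                       # forward-Euler integration step
        t += h
    residual = x**3 + x - (2.0 + math.sin(t))
    return x, residual


if __name__ == "__main__":
    x_final, res = znn_track_root()
    print("x(t_end) =", x_final, "residual =", res)
```

A larger `gamma` makes the error decay faster in continuous time, but with forward Euler the product `gamma * h` has to stay well below 1 for the iteration to remain stable.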
Broad Recommender System: An Efficient Nonlinear Collaborative Filtering Approach
IF 5.3 Zone 3 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-04-01 DOI: 10.1109/TETCI.2024.3378599
Ling Huang;Can-Rong Guan;Zhen-Wei Huang;Yuefang Gao;Chang-Dong Wang;C. L. P. Chen
Recently, Deep Neural Networks (DNNs) have been widely used in Collaborative Filtering (CF) to produce more accurate recommendation results owing to their ability to extract the nonlinear relationships in user-item pairs. However, DNN-based models usually incur high computational complexity, i.e., they consume very long training time and store a huge number of trainable parameters. To address these problems, we develop a novel broad recommender system named Broad Collaborative Filtering (BroadCF), which is an efficient nonlinear collaborative filtering approach. Instead of DNNs, Broad Learning System (BLS) is used as a mapping function to learn the nonlinear matching relationships in user-item pairs, which avoids the above issues while achieving very satisfactory rating prediction performance. In contrast to DNNs, BLS is a shallow network that captures nonlinear relationships between input features simply and efficiently. However, directly feeding the original rating data into BLS is not suitable because of the very large dimensionality of the original rating vector. To this end, a new preprocessing procedure is designed to generate a user-item rating collaborative vector, which is a low-dimensional user-item input vector that can leverage the quality judgments of the most similar users/items. Convincing experimental results on seven datasets have demonstrated the effectiveness of the BroadCF algorithm.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 4, pp. 2843-2857. https://doi.org/10.1109/TETCI.2024.3378599
Citations: 0
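To make the "shallow network as a mapping function" idea in the abstract above more concrete, here is a rough sketch of a generic Broad Learning System style regressor: random feature nodes, random nonlinear enhancement nodes, and output weights solved in closed form by ridge regression. It is not the BroadCF implementation. The class name `TinyBLS`, the node counts, the weight scaling, and the synthetic data are all assumptions, and the paper's preprocessing step that builds the user-item rating collaborative vector is not reproduced; the input `X` simply stands in for such low-dimensional vectors.

```python
import numpy as np


class TinyBLS:
    """Minimal broad-learning-style regressor (illustrative, untuned)."""

    def __init__(self, n_feature=20, n_enhance=40, ridge=1e-3, seed=0):
        self.n_feature = n_feature
        self.n_enhance = n_enhance
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _expand(self, X):
        # Feature nodes: random linear maps of the input.
        Z = X @ self.Wf + self.bf
        # Enhancement nodes: random nonlinear maps of the feature nodes.
        H = np.tanh(Z @ self.We + self.be)
        # The "broad" layer is the concatenation of both node groups.
        return np.hstack([Z, H])

    def fit(self, X, y):
        d = X.shape[1]
        # Random weights are drawn once and never trained.
        self.Wf = self.rng.standard_normal((d, self.n_feature)) / np.sqrt(d)
        self.bf = self.rng.standard_normal(self.n_feature)
        self.We = self.rng.standard_normal(
            (self.n_feature, self.n_enhance)) / np.sqrt(self.n_feature)
        self.be = self.rng.standard_normal(self.n_enhance)
        A = self._expand(X)
        # Only the output weights are learned, via ridge regression.
        gram = A.T @ A + self.ridge * np.eye(A.shape[1])
        self.W = np.linalg.solve(gram, A.T @ y)
        return self

    def predict(self, X):
        return self._expand(X) @ self.W


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 4))             # stand-in input vectors
    y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]       # synthetic nonlinear target
    model = TinyBLS().fit(X, y)
    print("train MSE:", np.mean((model.predict(X) - y) ** 2))
```

Because only `W` is fitted and it has a closed-form solution, training needs no gradient descent, which is the kind of efficiency argument the abstract makes against DNN-based models.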
A Small Object Real-Time Detection Method for Power Line Inspection in Low-Illuminance Environments
IF 5.3 Zone 3 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-04-01 DOI: 10.1109/TETCI.2024.3378651
Yubo Zhao;Jiaqi Wu;Wei Chen;Zehua Wang;Zijian Tian;Fei Richard Yu;Victor C. M. Leung
Power inspection in low-illuminance environments is of great significance for ensuring the all-weather stable operation of the power system. However, low visibility at night seriously degrades the detection of small-sized power devices. In response, we propose a small object real-time detection method for power line inspection in low-illuminance environments. We design an adaptive transformer-ISP (ATISP) module, in which the optimal parameter regression module generates hyperparameters by sensing input image features to guide the image signal processors (ISPs) in performing image enhancement. Building on ISPs, the ATISP offers fast inference and low training cost. Furthermore, the optimal parameter regression module extracts local features and long-distance dependencies through a CNN and a Transformer so that it perceives the input image more fully and the generated hyperparameters better correct image defects. In addition, we use the lightweight neural network MobileNetv3 to improve YOLOv7, so that the algorithm maintains excellent small object detection performance while significantly increasing detection speed. Moreover, the integrated model optimisation uses only the object detection loss functions, which allows ATISP to perform image enhancement according to the needs of object detection alone, improving small object detection and shortening the inference time of ATISP. In extensive experiments against 9 state-of-the-art object detection algorithms, our algorithm achieves the best small-scale insulator fault detection precision (mAP: 75.38%) on our DIFE dataset, the best small object detection precision (mAP: 56.31%) on the public dataset Exdark, and faster detection speed (FPS: 98.81 and 97.53), which shows that our method can achieve fast and accurate detection of insulators in low illuminance.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 6, pp. 3936-3950. https://doi.org/10.1109/TETCI.2024.3378651
Citations: 0
Network Group Partition and Core Placement Optimization for Neuromorphic Multi-Core and Multi-Chip Systems
IF 5.3 Zone 3 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-04-01 DOI: 10.1109/TETCI.2024.3379165
Yukuan Yang;Qihang Fan;Tianyi Yan;Jing Pei;Guoqi Li
Neuromorphic chips with multi-core architectures are considered to have great potential as the next generation of artificial intelligence (AI) chips because they avoid the memory wall effect. Deploying deep neural networks (DNNs) to these chips requires two stages, namely network partition and core placement. For network partition, existing schemes are mostly manual or focus only on single-layer, small-scale network partitions. For core placement, to the best of our knowledge, no existing work has completely solved the clock-level communication deadlock problem that commonly arises in applications of neuromorphic multi-core and multi-chip (NMCMC) systems. To address these issues, which affect the operating and deployment efficiency of NMCMC systems, we formulate the network group partition problem as an optimization problem for the first time and propose a search-based network group partition scheme to solve it. A clock-level multi-chip simulator is established to completely avoid the deadlock problem during the core placement optimization process. Moreover, a region constrained simulated annealing (RCSA) algorithm is proposed to improve the efficiency of the core placement optimization. Finally, an automated toolchain for the efficient deployment of DNNs in NMCMC systems is developed by integrating the proposed network group partition and core placement schemes. Experiments show that, compared with existing manual schemes, the proposed group partition scheme uses 22.25%, 17.77%, and 14.80% fewer cores, improves memory utilization by 9.44%, 7.96%, and 5.16%, and yields more balanced communication and computation loads on ResNet-18, ResNet-34, and ResNet-50, respectively. In addition, the proposed core placement optimization based on the RCSA algorithm shows higher efficiency with far fewer optimization steps and achieves 9.52%, 11.91%, and 27.52% higher throughput compared with sequential core placement without deadlock in the ResNet-18, ResNet-34, and ResNet-50 networks. This work paves the way for applying NMCMC systems to real-world scenarios to reach more powerful machine intelligence.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 6, pp. 3966-3981. https://doi.org/10.1109/TETCI.2024.3379165
Citations: 0
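As a rough illustration of treating core placement as a search problem, the sketch below runs plain simulated annealing over assignments of logical cores to slots on a 2D grid, minimising traffic volume times Manhattan distance. It is not the RCSA algorithm from the abstract above: the region constraint, the clock-level simulator, and any deadlock handling are not modelled. The cost function, the swap move, the cooling schedule, and the names `placement_cost` and `anneal_placement` are assumptions made for this example.

```python
import math
import random


def placement_cost(placement, traffic, width):
    """Sum of traffic volume times Manhattan distance between placed cores."""
    cost = 0
    for (a, b), volume in traffic.items():
        row_a, col_a = divmod(placement[a], width)
        row_b, col_b = divmod(placement[b], width)
        cost += volume * (abs(row_a - row_b) + abs(col_a - col_b))
    return cost


def anneal_placement(n_cores, traffic, width, steps=20000,
                     t_start=5.0, t_end=0.01, seed=0):
    """Plain simulated annealing; placement[i] is the grid slot of core i."""
    rng = random.Random(seed)
    placement = list(range(n_cores))
    rng.shuffle(placement)                      # random initial placement
    cost = placement_cost(placement, traffic, width)
    best, best_cost = placement[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n_cores), 2)    # propose swapping two cores
        placement[i], placement[j] = placement[j], placement[i]
        new_cost = placement_cost(placement, traffic, width)
        accept = (new_cost <= cost
                  or rng.random() < math.exp((cost - new_cost) / temp))
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = placement[:], cost
        else:
            # Reject the move by undoing the swap.
            placement[i], placement[j] = placement[j], placement[i]
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical workload: 16 cores in a chain, each sending to the next.
    chain = {(i, i + 1): 1 for i in range(15)}
    best, best_cost = anneal_placement(16, chain, width=4)
    print("best placement:", best, "cost:", best_cost)
```

Recomputing the full cost after every swap keeps the sketch short; a practical version would update only the terms that involve the two swapped cores.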