Pub Date: 2024-04-16 | DOI: 10.1109/TAI.2024.3356669
K. P. Suba Subbalakshmi;Wojciech Samek;Xia Ben Hu
This special issue brings together seven articles that address different aspects of explainable and interpretable artificial intelligence (AI). Over the years, machine learning (ML) and AI models have posted strong performance across several tasks. This has sparked interest in deploying these methods in critical applications such as health and finance. However, to be deployable in the field, ML and AI models must be trustworthy. Explainable and interpretable AI are two areas of research that have become increasingly important for ensuring the trustworthiness, and hence deployability, of advanced AI and ML methods. Interpretable AI refers to models that obey domain-specific constraints so that humans can understand them more readily; in essence, they are not black-box models. Explainable AI, on the other hand, refers to methods that are typically used to explain a separate black-box model.
{"title":"Guest Editorial: New Developments in Explainable and Interpretable Artificial Intelligence","authors":"K. P. Suba Subbalakshmi;Wojciech Samek;Xia Ben Hu","doi":"10.1109/TAI.2024.3356669","DOIUrl":"https://doi.org/10.1109/TAI.2024.3356669","url":null,"abstract":"This special issue brings together seven articles that address different aspects of explainable and interpretable artificial intelligence (AI). Over the years, machine learning (ML) and AI models have posted strong performance across several tasks. This has sparked interest in deploying these methods in critical applications like health and finance. However, to be deployable in the field, ML and AI models must be trustworthy. Explainable and interpretable AI are two areas of research that have become increasingly important to ensure trustworthiness and hence deployability of advanced AI and ML methods. Interpretable AI are models that obey some domain-specific constraints so that they are better understandable by humans. In essence, they are not black-box models. On the other hand, explainable AI refers to models and methods that are typically used to explain another black-box model.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 4","pages":"1427-1428"},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10500898","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140559370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-16 | DOI: 10.1109/TAI.2024.3389611
Hezhe Sun;Yufei Wang;Huiwen Yang;Kaixuan Huo;Yuzhe Li
Privacy-aware machine learning paradigms have attracted widespread attention due to their ability to safeguard the local privacy of data owners, preventing the leakage of private information to untrustworthy platforms or malicious third parties. This article focuses on characterizing the interactions between the learner and the data owner within this privacy-aware training process. Here, the data owner hesitates to transmit the original gradient to the learner because of potential cybersecurity issues such as gradient leakage and membership inference. To address this concern, we propose a Stackelberg game framework that models the training process. In this framework, the data owner's objective is not to maximize the discrepancy between the gradient the learner obtains and the true gradient, but rather to ensure that the learner obtains a gradient closely resembling one deliberately designed by the data owner; the learner's objective is to recover the true gradient as accurately as possible. We derive the optimal encoder and decoder under mismatched cost functions and characterize the equilibrium for specific cases, balancing model accuracy and local privacy. Numerical examples illustrate the main results, and we conclude with a discussion that suggests future investigations into reliable countermeasure designs.
{"title":"Strategic Gradient Transmission With Targeted Privacy-Awareness in Model Training: A Stackelberg Game Analysis","authors":"Hezhe Sun;Yufei Wang;Huiwen Yang;Kaixuan Huo;Yuzhe Li","doi":"10.1109/TAI.2024.3389611","DOIUrl":"https://doi.org/10.1109/TAI.2024.3389611","url":null,"abstract":"Privacy-aware machine learning paradigms have sparked widespread concern due to their ability to safeguard the local privacy of data owners, preventing the leakage of private information to untrustworthy platforms or malicious third parties. This article focuses on characterizing the interactions between the learner and the data owner within this privacy-aware training process. Here, the data owner hesitates to transmit the original gradient to the learner due to potential cybersecurity issues, such as gradient leakage and membership inference. To address this concern, we propose a Stackelberg game framework that models the training process. In this framework, the data owner's objective is not to maximize the discrepancy between the learner's obtained gradient and the true gradient but rather to ensure that the learner obtains a gradient closely resembling one deliberately designed by the data owner, while the learner's objective is to recover the true gradient as accurately as possible. We derive the optimal encoder and decoder using mismatched cost functions and characterize the equilibrium for specific cases, balancing model accuracy and local privacy. Numerical examples illustrate the main results, and we conclude with expanding discussions to suggest future investigations into reliable countermeasure designs.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 9","pages":"4635-4648"},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142169723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-16 | DOI: 10.1109/TAI.2024.3388389
Mingfu Xue;Xin Wang;Yinghao Wu;Shifeng Ni;Leo Yu Zhang;Yushu Zhang;Weiqiang Liu
Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which requires modifying the model and does not consider interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNNs based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the ownership-verification decision is interpretable. We extract the intrinsic features of the DNN model using deep Taylor decomposition. Since these intrinsic features capture the model's unique interpretation of its decisions, they can be regarded as a fingerprint of the model. If the fingerprint of a suspected model matches that of the original model, the suspected model is considered a pirated copy. Experimental results demonstrate that the fingerprints can be successfully used to verify model ownership and that the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning, pruning, watermark-overwriting, and adaptive attacks.
{"title":"An Explainable Intellectual Property Protection Method for Deep Neural Networks Based on Intrinsic Features","authors":"Mingfu Xue;Xin Wang;Yinghao Wu;Shifeng Ni;Leo Yu Zhang;Yushu Zhang;Weiqiang Liu","doi":"10.1109/TAI.2024.3388389","DOIUrl":"https://doi.org/10.1109/TAI.2024.3388389","url":null,"abstract":"Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which need to modify the model and do not consider/mention interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNN based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable. We extract the intrinsic features of the DNN model by using deep Taylor decomposition. Since the intrinsic feature is composed of unique interpretation of the model's decision, the intrinsic feature can be regarded as fingerprint of the model. If the fingerprint of a suspected model is the same as the original model, the suspected model is considered as a pirated model. Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model and the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning attack, pruning attack, watermark overwriting attack, and adaptive attack.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 9","pages":"4649-4659"},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142169670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}