This study develops a visual-based docking system (VDS) for an autonomous underwater vehicle (AUV), significantly enhancing docking performance by integrating intelligent object recognition and deep reinforcement learning (DRL). The system overcomes the limitations of traditional navigation in complex and unpredictable environments by using a variable information dock (VID) that enables precise multi-sensor dock recognition by the AUV. Employing image-based visual servoing (IBVS), the VDS efficiently converts 2D visual data into accurate 3D motion control commands. It integrates the You Only Look Once (YOLO) algorithm for object recognition with the deep deterministic policy gradient (DDPG) algorithm for continuous motion control, improving docking accuracy and adaptability. Experimental validation at the National Cheng Kung University towing tank demonstrates that the VDS enhances control stability and operational reliability, reducing the mean absolute error (MAE) in depth control by 42.03% and in pitch control by 98.02% compared to the previous method. These results confirm the VDS's reliability and its potential to transform AUV docking.
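To make the described pipeline concrete, the sketch below illustrates one plausible structure for such a control loop: a YOLO-style detector supplies a 2D bounding box of the dock, an IBVS-style feature error is formed in the image plane, and a DDPG-style actor maps that error to continuous motion commands. All names, network sizes, and the stubbed detector are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a VDS-like control loop (assumed structure, not the paper's code).
import numpy as np
import torch
import torch.nn as nn

IMG_W, IMG_H = 640, 480  # assumed camera resolution


def detect_dock_stub(frame: np.ndarray) -> np.ndarray:
    """Stand-in for the YOLO detector: returns a dock bounding box
    (cx, cy, w, h) in pixels. A real system would run the trained model here."""
    return np.array([300.0, 250.0, 80.0, 60.0])


def ibvs_feature_error(bbox: np.ndarray) -> np.ndarray:
    """IBVS-style error: offset of the box centre from the image centre, plus a
    box-size cue standing in for range, all normalized roughly to [-1, 1]."""
    cx, cy, w, h = bbox
    ex = (cx - IMG_W / 2) / (IMG_W / 2)
    ey = (cy - IMG_H / 2) / (IMG_H / 2)
    e_size = 1.0 - (w * h) / (IMG_W * IMG_H)  # large when the dock appears small (far away)
    return np.array([ex, ey, e_size], dtype=np.float32)


class Actor(nn.Module):
    """DDPG-style deterministic actor: image-space error -> bounded continuous
    motion commands (e.g. surge, heave, pitch rate)."""

    def __init__(self, state_dim: int = 3, action_dim: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


if __name__ == "__main__":
    actor = Actor()  # in practice, weights would come from DDPG training, not random init
    frame = np.zeros((IMG_H, IMG_W, 3), dtype=np.uint8)  # placeholder camera frame
    error = ibvs_feature_error(detect_dock_stub(frame))
    with torch.no_grad():
        command = actor(torch.from_numpy(error))
    print("image-space error:", error, "-> motion command:", command.numpy())
```

In this sketch the detector output is reduced to an image-plane error before reaching the policy, reflecting the IBVS idea of controlling directly on 2D features rather than on a reconstructed 3D pose.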