Applied Soft Computing: Latest Articles, Page 4

A continuous concrete vibration method for robots based on machine vision with integrated spatial features

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-12 DOI: 10.1016/j.asoc.2024.112231

Traditional manual concrete vibration faces numerous limitations, necessitating an efficient automated method to assist in this task. This study proposes a vision-based continuous concrete vibration method for vibrating robots. By enhancing the YOLOv8n model with attention mechanisms, the proposed method achieves an AP of 93.31 % in identifying reinforcing grids and 25.6 FPS on embedded systems. For the first time in concrete vibration tasks, this study uses spatial positional information to cluster coordinate data, transforming confidence-sorted data into spatially ordered sequences. A case test with a vibrating robot shows that the proposed method increases vibration speed by 22.18 % and improves the vibration success rate by 11.67 % compared with traditional strategies. Additionally, on-site experiments conducted at four construction sites demonstrated the robustness of the proposed method. These findings advance automation in concrete vibration work, with significant implications for robotics and construction engineering.
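
The step of turning confidence-sorted detections into a spatially ordered sequence can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes detections arrive as (x, y, confidence) tuples, groups them into rows with a simple gap threshold on y (the hypothetical parameter `row_tol`), and visits the rows in a serpentine path. The function name and the clustering rule are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, confidence); layout is an assumption


def spatially_order(points: List[Point], row_tol: float) -> List[Point]:
    """Group detections into rows by y, then traverse the rows in a serpentine path."""
    if not points:
        return []
    by_y = sorted(points, key=lambda p: p[1])
    rows: List[List[Point]] = [[by_y[0]]]
    for p in by_y[1:]:
        # Start a new row when the y-gap to the previous point exceeds the tolerance.
        if p[1] - rows[-1][-1][1] > row_tol:
            rows.append([p])
        else:
            rows[-1].append(p)
    ordered: List[Point] = []
    for i, row in enumerate(rows):
        # Even rows run left to right, odd rows right to left.
        row.sort(key=lambda p: p[0], reverse=(i % 2 == 1))
        ordered.extend(row)
    return ordered


# Hypothetical detections, listed in descending confidence as a detector might return them.
detections = [(0.0, 10.0, 0.95), (10.0, 0.0, 0.90), (0.0, 0.0, 0.80),
              (10.0, 10.0, 0.70), (5.0, 0.5, 0.60)]
path = spatially_order(detections, row_tol=2.0)
```

A serpentine traversal is only one plausible ordering; it keeps consecutive targets adjacent, which is the property a continuously moving vibrator would need.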

Citations: 0

Short-term air quality prediction using point and interval deep learning systems coupled with multi-factor decomposition and data-driven tree compression

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-11 DOI: 10.1016/j.asoc.2024.112191

Clean air is the most basic requirement for maintaining health, so accurate short-term air quality prediction is vital. Decomposition algorithms can better capture the local features and temporal changes of the data, but they increase the computation time, resource consumption, and complexity of the model. In addition, existing forecasting systems overlook instability and uncertainty. To solve these problems, a deterministic (point) and uncertainty (interval) AOA-DBGRU-MDN deep learning system is proposed, which combines the arithmetic optimization algorithm (AOA), double-layer bi-directional GRUs (DBGRU), and a mixture density network (MDN). The system considers meteorological factors and air pollutants comprehensively. It involves feature selection using the maximum information coefficient (MIC), decomposition using the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) algorithm, and classification and compression of the decomposed components using entropy-Huffman tree compression. First, the information measurement process significantly reduces the number of components. After multi-factor data are incorporated, the optimal DBGRU model is obtained using AOA. Finally, the training errors are fitted using MDN to obtain interval prediction results. The experiments demonstrate that (1) using the CEEMDAN algorithm can improve prediction accuracy; (2) classifying and reconstructing the data based on entropy-Huffman tree compression not only decreases the model's training volume and improves training efficiency but also boosts prediction accuracy; and (3) the AOA-DBGRU-MDN system performs probabilistic prediction to obtain an effective and intuitive prediction interval to improve the point prediction of air quality.

Citations: 0

A survey on hand gesture recognition based on surface electromyography: Fundamentals, methods, applications, challenges and future trends

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-11 DOI: 10.1016/j.asoc.2024.112235

Hand gestures are crucial for developing prosthetic and rehabilitation devices, enabling intuitive human–computer interaction (HCI) and improving accessibility for individuals with impairments. Recently, gesture recognition systems based on surface electromyography (sEMG) have been widely employed in various fields, demonstrating remarkable advantages and developments. In this paper, we present a comprehensive survey on sEMG-based hand gesture recognition. We provide an overview of the basic knowledge and background of sEMG signals and the acquisition equipment used. We delve into the applied feature extraction methods and classification models, focusing on recent advances in deep learning techniques. We also identify the datasets of sEMG signals used for hand gesture recognition. Moreover, we highlight recent applications of sEMG-based gesture recognition methods, including HCI, sign language recognition, rehabilitation, prosthesis control, and exoskeletons for augmentation. Additionally, we outline the latest innovative progress in this field, such as the influence of force, user identity detection, and migration effects. We also discuss the current limitations and challenges. Finally, we summarize the main findings and discuss future directions to enhance sEMG-based hand gesture recognition.

Citations: 0

Temporal relation transformer for robust visual tracking with dual-memory learning

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112229

Recently, transformer trackers mostly associate multiple reference images with the search area to adapt to the changing appearance of the target. However, they ignore the learned cross-relations between the target and surrounding, leading to difficulties in building coherent contextual models for specific target instances. This paper presents a Temporal Relation Transformer Tracker (TRTT) for robust visual tracking, providing a concise approach to modeling temporal relations by dual target memory learning. Specifically, a temporal relation transformer network generates paired memories based on static and dynamic templates, which are reinforced interactively. The memory contains implicit relation hints that capture the relations between the tracked object and its immediate surroundings. More importantly, to ensure consistency of target instance identities between frames, the relation hints from previous frames are transferred to the current frame for merging temporal contextual attention. Our method also incorporates mechanisms for reusing favorable cross-relations and instance-specific features, thereby overcoming background interference in complex spatio-temporal interactions through a sequential constraint. Furthermore, we design a memory token sparsification method that leverages the key points of the target to eliminate interferences and optimize attention calculations. Extensive experiments demonstrate that our method surpasses advanced trackers on 8 challenging benchmarks while maintaining real-time running speed.

Citations: 0

Efficient face anti-spoofing via head-aware transformer based knowledge distillation with 5 MB model parameters

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112237

Although face recognition technology has been applied in many scenarios, it still suffers from many types of presentation attacks, so face anti-spoofing (FAS) has become a hot topic in computer vision. Recently, the vision transformer has been recognized as the mainstream architecture for FAS, but it typically relies on auxiliary information, sophisticated tricks, and a huge number of model parameters. Because face-based identity authentication usually takes place on mobile-like devices, designing an effective and lightweight model is of great significance. Inspired by the powerful global modeling ability of self-attention and the model compression ability of knowledge distillation, a simple yet effective knowledge distillation approach is proposed for FAS under the transformer framework. Our primary idea is to leverage the rich knowledge of a teacher network pre-trained on large-scale face data to guide the learning of a lightweight student network. The main contributions of our method are threefold: (1) Feature- and logits-level distillation are combined to transfer the rich knowledge of the teacher to the student. (2) A head-aware strategy is proposed to deal with the dimension mismatch of middle encoder layers between the teacher and student networks, in which a novel attention head correlation matrix is introduced. (3) Our method can bridge the performance gap between teacher and student, and the resulting student network is extremely lightweight with only 5 MB of parameters. Extensive experiments are conducted on three public face-spoofing datasets, CASIA-FASD, Replay-Attack and OULU-NPU; the results demonstrate that our method can obtain performance on par with or superior to most FAS methods and outperform many knowledge distillation methods. Meanwhile, the distilled student network achieves excellent performance with 17× fewer parameters and 9× faster inference compared to the teacher network. The code will be publicly available at https://github.com/Maricle-zhangjun/HaTFAS.
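
To make the idea of combining feature-level and logits-level distillation concrete, here is a minimal sketch in plain Python. It is not taken from the HaTFAS code: it assumes the common temperature-scaled KL divergence for logits and a mean squared error for features, it assumes teacher and student features have already been brought to the same dimension, and it does not reproduce the head-aware correlation matrix. The weights `alpha` and `beta` and all function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation(student: List[float], teacher: List[float],
                       temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2 (a common convention)."""
    p_t = softmax(teacher, temperature)
    p_s = softmax(student, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)
    return kl * temperature ** 2


def feature_distillation(student_feat: List[float], teacher_feat: List[float]) -> float:
    """Mean squared error between feature vectors of equal length (an assumption)."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(teacher_feat)


def distillation_loss(student_logits: List[float], teacher_logits: List[float],
                      student_feat: List[float], teacher_feat: List[float],
                      alpha: float = 1.0, beta: float = 1.0) -> float:
    """Weighted sum of the two terms; alpha and beta are hypothetical weights."""
    return (alpha * logit_distillation(student_logits, teacher_logits)
            + beta * feature_distillation(student_feat, teacher_feat))
```

In practice a supervised classification loss on the live/spoof labels would normally be added to this sum; it is left out here to keep the two distillation terms visible.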

Citations: 0

An ultra-high-definition multi-exposure image fusion method based on multi-scale feature extraction

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112240

Multiple exposure image fusion is a technique used to obtain high dynamic range images. Due to its low cost and high efficiency, it has received a lot of attention from researchers in recent years. Currently, most deep learning-based multiple exposure image fusion methods extract features from different exposure images using a single feature extraction method. Some methods simply rely on two different modules to directly extract features. However, this approach inevitably leads to the loss of some feature information during the feature extraction process, thus further affecting the performance of the model. To minimize the loss of feature information as much as possible, we propose an ultra-high-definition (UHD) multiple exposure image fusion method based on multi-scale feature extraction. The method adopts a U-shaped structure to construct the overall network model, which can fully exploit the feature information at different levels. Additionally, we construct a novel hybrid stacking paradigm to combine convolutional neural networks and Transformer modules. This combined module can extract both local texture features and global color features simultaneously. To more efficiently fuse and extract features, we also design a cross-layer feature fusion module, which can adaptively learn the correlation between features at different layers. Numerous quantitative and qualitative results demonstrate that our proposed method performs well in UHD multiple exposure image fusion.

Citations: 0

An adaptive shuffled frog-leaping algorithm for flexible flow shop scheduling problem with batch processing machines

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112230

Batch Processing Machines (BPM) and transportation are seldom studied simultaneously in the Flexible Flow Shop. In this study, the Flexible Flow Shop Scheduling Problem (FFSP) with BPM at the last stage and transportation is considered, and an adaptive shuffled frog-leaping algorithm (ASFLA) is proposed to minimize makespan. To produce high-quality solutions, a heuristic is employed to produce the initial solution, and two groups are formed from all memeplexes. An adaptive memeplex search is then implemented, in which the number of searches is dynamically determined by the quality of the memeplex. An adaptive group search is also conducted by exchanging memeplexes or supporting the worse memeplex. A novel population shuffling and elimination of the worst memeplex are also proposed. A number of computational experiments are executed to test the new strategies and the performance of ASFLA. Computational results demonstrate that the new strategies are effective and that ASFLA is a very competitive algorithm for FFSP with BPM and transportation.
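
For readers unfamiliar with memeplexes, the sketch below shows the partitioning step of the standard shuffled frog-leaping algorithm: solutions are ranked by objective value and dealt round-robin into groups. It illustrates the baseline technique only; the adaptive search counts, group exchange, and worst-memeplex elimination described in the abstract are not reproduced, and the function name and sample values are hypothetical.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def partition_into_memeplexes(population: List[T],
                              objective: Callable[[T], float],
                              num_memeplexes: int) -> List[List[T]]:
    """Rank solutions by objective (lower is better, e.g. makespan) and deal them round-robin."""
    ranked = sorted(population, key=objective)
    memeplexes: List[List[T]] = [[] for _ in range(num_memeplexes)]
    for rank, frog in enumerate(ranked):
        # Rank 0 goes to memeplex 0, rank 1 to memeplex 1, and so on, wrapping around.
        memeplexes[rank % num_memeplexes].append(frog)
    return memeplexes


# Hypothetical makespans standing in for full schedules.
makespans = [7, 3, 9, 1, 5, 8]
groups = partition_into_memeplexes(makespans, objective=lambda m: m, num_memeplexes=2)
```

Dealing in rank order gives each memeplex a mix of good and poor solutions, so the quality of a memeplex (for example, its best or mean makespan) can then be compared across groups.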

Citations: 0

A Fermatean fuzzy SWARA-TOPSIS methodology based on SCOR model for autonomous vehicle parking lot selection

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112198

Population growth in crowded cities and the resulting increase in vehicle use have led to the problem of insufficient parking. When public parking lots and urban growth are not coordinated, vehicles park on the street and block the crosswalks. In the coming years, this problem will become more complicated with the addition of autonomous vehicles (AVs) to urban traffic. This study addresses the research question of how to effectively select AV parking lots in urban areas experiencing population growth and increased vehicle usage. To this end, a hybrid Multi-Criteria Decision Making (MCDM) methodology is proposed that combines the SWARA (Step-wise Weight Assessment Ratio Analysis) and TOPSIS (Technique for Order Preference by Similarity) approaches in a Fermatean Fuzzy (FF) environment. A decision hierarchy based on the SCOR model has been developed to determine and construct the evaluation criteria. A case study analysis has then been applied to selected districts in Istanbul, Turkiye's most populous and still-developing city. Operating expenses, safety and security, and land costs are determined to be the most important factors. The detailed fuzzy analysis determines which districts should primarily be chosen for AV parking lots in Istanbul, and finally the robustness and validity of the results are examined through sensitivity analysis. The study contributes by providing insights into AV parking lot selection, demonstrating the efficacy of the proposed methodology, and highlighting the importance of addressing this issue in urban planning.
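
The ranking step can be illustrated with classical (crisp) TOPSIS, sketched below. This is a simplification: the study works with Fermatean fuzzy numbers and derives weights with SWARA, neither of which is reproduced here. The district labels, criterion values, and weights are hypothetical and chosen only so the example runs; the three criteria mirror the factors the abstract names as most important.

```python
import math
from typing import List


def topsis(matrix: List[List[float]], weights: List[float],
           benefit: List[bool]) -> List[float]:
    """Return the closeness coefficient of each alternative (higher is better)."""
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*weighted))
    # Ideal takes the best value per criterion, anti-ideal the worst.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical data: columns are operating expenses, safety and security, land cost.
districts = ["A", "B", "C"]
matrix = [[100.0, 9.0, 50.0],
          [200.0, 3.0, 80.0],
          [150.0, 6.0, 65.0]]
weights = [0.40, 0.35, 0.25]     # hypothetical; the study obtains weights via SWARA
benefit = [False, True, False]   # costs are minimised, safety is maximised
scores = topsis(matrix, weights, benefit)
ranking = [d for _, d in sorted(zip(scores, districts), reverse=True)]
```

A sensitivity analysis in this setting typically means perturbing `weights` and checking whether `ranking` changes.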

Citations: 0

A sound event detection support system for smart home based on “two-to-one” teacher–student learning

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-10 DOI: 10.1016/j.asoc.2024.112224

Sound event detection (SED) is a core technology in smart home projects that rely on detected sound events to trigger specific actions. SED systems face two major challenges: high labeling costs and complex acoustic environments. To reduce labeling costs, some semi-supervised systems extract both global and local features for classification. However, these methods treat global and local features equally and do not account for their varying importance when recognizing different types of sound events. To address complex acoustic environments, some studies use multitask learning frameworks that introduce SED-related tasks as auxiliaries to improve detection performance. However, these methods fail to align the tasks within the framework, leading to conflicting outputs that may limit system performance. To address these issues, this paper proposes a semi-supervised SED system based on “two-to-one” teacher-student learning. The system employs a gating mechanism to selectively enhance global and local features, improving adaptability to different types of sound events, and incorporates a cross-task alignment module that lets SED interact with related tasks, reducing the risk of performance degradation caused by conflicting outputs. Experimental results on two datasets show that the system achieves the best performance on all metrics, with EB-F1 scores of 48.1% and 64.7%, improvements of 15.3% and 10.6% over the baseline ConformerSED system, respectively. The work offers an effective SED solution for smart home projects: a semi-supervised system that performs well while reducing labeling costs.
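
The abstract does not specify how the gating mechanism is built, so the following Python sketch only illustrates one common form of gated feature fusion: a sigmoid gate computed from the concatenated global and local features decides, per dimension, how much of each to keep. The function name `gated_fusion`, the feature size, and the random weights (standing in for learned parameters) are assumptions for illustration, not the authors' design.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(global_feat, local_feat, weights, bias):
    """Blend one frame's global and local feature vectors with a sigmoid gate.

    global_feat, local_feat: lists of length D.
    weights: D rows of length 2*D (hypothetical learned parameters).
    bias: list of length D.
    """
    concat = global_feat + local_feat  # list concatenation, length 2*D
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]
    # A gate near 1 favours the global feature, near 0 the local feature.
    fused = [
        g * gf + (1.0 - g) * lf
        for g, gf, lf in zip(gate, global_feat, local_feat)
    ]
    return fused, gate


# Hypothetical inputs: random numbers stand in for real features and weights.
random.seed(0)
D = 4
global_feat = [random.gauss(0.0, 1.0) for _ in range(D)]
local_feat = [random.gauss(0.0, 1.0) for _ in range(D)]
weights = [[random.gauss(0.0, 0.1) for _ in range(2 * D)] for _ in range(D)]
bias = [0.0] * D
fused, gate = gated_fusion(global_feat, local_feat, weights, bias)
```

Because the gate is a convex weight, each fused value lies between the corresponding global and local values, so the fusion can emphasise one view without ever amplifying either beyond its input range.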

Citations: 0

A linear programming aggregation method based on generalized Zhenyuan integral in q-ROFN environment and the application of talent recruitment in universities

IF 7.2 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing

Pub Date : 2024-09-08 DOI: 10.1016/j.asoc.2024.112214

A reasonable ranking of the binary pairs that characterize fuzzy information is very important in many fuzzy decision problems. To overcome some defects of the existing score functions for q-rung orthopair fuzzy numbers (q-ROFNs), a novel score function and ranking criterion are proposed using the q-compression transformation and a hesitation factor. The main motivation is to introduce the generalized Zhenyuan (GZ) integral into the q-ROFN environment and to transform the aggregation operations into a linear programming problem through the arithmetic operations of q-ROFNs. The main contribution is to solve the aggregation problem of the q-rung orthopair fuzzy generalized Zhenyuan integral ordered weighted average (q-ROFGZIOWA) operator with linear programming, and to establish a new decision-making method from the q-ROFGZIOWA operator and the ranking criterion. The main innovation is that all q-ROFNs are mapped to the unit triangle in the first quadrant (that is, converted into intuitionistic fuzzy numbers, IFNs) by the geometrically motivated q-compression transformation, that the novel score function and its ranking criterion are built by incorporating the hesitation factor, and that the aggregation operation based on the generalized Z-integral is then converted into a linear programming optimization problem. Finally, the superiority of the proposed method is verified by comparing the aggregation results of two integral operators on an example, and the method is applied to optimal decision-making for talent recruitment in universities. The proposed method not only corrects some flaws in existing rankings of q-ROFNs but also overcomes some defects of existing Choquet integral average (geometric) operators in a q-ROFN environment. These results are of great significance for further research on the widespread application of q-ROFNs.
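
The abstract does not give the formulas, so the Python sketch below only illustrates the general idea under stated assumptions. A q-ROFN is a pair (mu, nu) with mu^q + nu^q <= 1. The code assumes the q-compression transformation is the map (mu, nu) -> (mu^q, nu^q), which lands in the unit triangle of intuitionistic fuzzy numbers. The ranking key is the classical score mu^q - nu^q with lower hesitation as a tie-breaker; this is a stand-in, not the paper's novel score function. All names and sample values are hypothetical.

```python
from typing import NamedTuple


class QROFN(NamedTuple):
    mu: float  # membership degree
    nu: float  # non-membership degree


def is_valid(x, q):
    """Check the q-rung orthopair constraint mu^q + nu^q <= 1."""
    in_range = 0.0 <= x.mu <= 1.0 and 0.0 <= x.nu <= 1.0
    return in_range and x.mu ** q + x.nu ** q <= 1.0 + 1e-12


def compress(x, q):
    """Assumed q-compression: map (mu, nu) to the intuitionistic pair (mu^q, nu^q)."""
    return (x.mu ** q, x.nu ** q)


def rank_key(x, q):
    """Illustrative ranking key: classical score first, then lower hesitation."""
    m, n = compress(x, q)
    hesitation = 1.0 - m - n
    return (m - n, -hesitation)


# Hypothetical candidates evaluated as q-ROFNs with q = 3.
q = 3
items = {
    "A": QROFN(0.9, 0.6),
    "B": QROFN(0.8, 0.7),
    "C": QROFN(0.5, 0.5),
}
ranking = sorted(items, key=lambda name: rank_key(items[name], q), reverse=True)
```

Candidate A shows why a larger q is useful: 0.9^2 + 0.6^2 exceeds 1, so the pair is not a valid Pythagorean (q = 2) fuzzy number, yet it satisfies the constraint for q = 3.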

Citations: 0