
Systems and Soft Computing: Latest Articles

Path optimization for robots using nature-inspired hybrid algorithms: ACO, PSO, and GSO
IF 3.6 Pub Date : 2026-01-05 DOI: 10.1016/j.sasc.2026.200439
Sameer Shastri, Nagendra Tripathi, Shobha Lata Sinha, Saroj Kumar Pandey, Maheswaran S, Anurag Sinha, Ayodeji Olalekan Salau
Path planning is a fundamental challenge in artificial intelligence, particularly when autonomous robots must operate in complex and dynamic environments. Conventional optimization techniques, including Genetic Algorithms (GA) and Ant Colony Optimization (ACO), have been extensively applied to this problem; however, such single-method approaches frequently encounter limitations in efficiency, adaptability, and robustness. To overcome these challenges, this study introduces a hybrid metaheuristic framework that integrates three nature-inspired algorithms: Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and Glow-worm Swarm Optimization (GSO). The proposed approach, termed APGO (Ant Colony, Particle Swarm, and Glow-worm Optimizer), capitalizes on the distinct strengths of each algorithm (ACO for shortest-path identification, PSO for rapid convergence, and GSO for effective local search) while mitigating their individual weaknesses. The principal contribution of this work lies in demonstrating that integrating ACO, PSO, and GSO yields a significant improvement in both path length and travel time. Extensive simulation studies and comparative evaluations against traditional single-algorithm methods confirm that APGO achieves superior efficiency, reliability, and computational effectiveness. These findings provide strong evidence that hybrid metaheuristic strategies can advance the state of the art in autonomous robot navigation and serve as a foundation for future developments in intelligent path planning.
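The abstract does not say how APGO couples its three components, so the following is only a minimal sketch of the textbook ACO step that underlies "shortest-path identification": the probability of moving to a neighbouring node, weighted by pheromone and inverse distance. The function name, the `alpha` and `beta` defaults, and the example waypoints are assumptions made for illustration, not values from the paper.

```python
def aco_transition_probabilities(pheromone, distance, alpha=1.0, beta=2.0):
    """Return the probability of moving to each candidate node.

    pheromone[j] is the pheromone level on the edge to node j and
    distance[j] is the length of that edge; 1 / distance is the usual
    heuristic "visibility" term. alpha and beta weight the two factors.
    """
    weights = {
        j: (pheromone[j] ** alpha) * ((1.0 / distance[j]) ** beta)
        for j in pheromone
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Hypothetical example: three candidate waypoints with equal pheromone,
# so the choice is driven by edge length alone.
probs = aco_transition_probabilities(
    pheromone={"A": 1.0, "B": 1.0, "C": 1.0},
    distance={"A": 1.0, "B": 2.0, "C": 4.0},
)
print(probs)
```

With equal pheromone the shortest edge receives the largest probability; as ants deposit pheromone on good paths, the `pheromone` term starts to dominate.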
Systems and Soft Computing, Volume 8, Article 200439.
Citations: 0
A hybrid architecture combining physical modeling and neural networks for piano sound synthesis
IF 3.6 Pub Date : 2026-01-02 DOI: 10.1016/j.sasc.2025.200435
Dan Zhang
This paper presents a novel hybrid architecture for piano sound synthesis that combines physical modeling and neural networks. The proposed approach leverages the strengths of both methods by integrating a physical modeling module that simulates the acoustic behavior of piano components with a neural network module that learns complex timbral mappings. The physical modeling module provides a physically grounded foundation by modeling string vibrations, hammer-string interactions, and soundboard radiation, while the neural network captures subtle tonal characteristics through data-driven learning. Experimental results demonstrate that our hybrid architecture achieves superior performance in terms of sound quality, realism, and expressiveness compared to traditional synthesis methods, while maintaining reasonable computational efficiency. Objective evaluations show improved spectral accuracy with a spectral convergence score of 0.92, and subjective listening tests confirm enhanced perceptual quality with realism scores of 4.6 and expressiveness scores of 4.5 on a 5-point scale. The hybrid approach achieves synthesis in 2.5 seconds with 1200 MB memory usage, demonstrating computational efficiency. The architecture enables intuitive control through physical parameters while benefiting from the learning capabilities of neural networks, making it suitable for various applications in music production and performance.
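The abstract gives no equations for its string, hammer, or soundboard models. As a rough illustration of what "physical modeling" of a vibrating string can mean, here is a sketch of the classic Karplus-Strong algorithm, a far simpler technique swapped in for illustration and not the paper's model. The function name and all parameter values are assumptions.

```python
import random


def karplus_strong(frequency_hz, n_samples, sample_rate=44100,
                   decay=0.996, seed=0):
    """Generate a decaying string-like tone as a list of floats in [-1, 1]."""
    rng = random.Random(seed)
    # The delay-line length sets the pitch: one period of the fundamental.
    period = max(2, int(sample_rate / frequency_hz))
    # Excitation: a burst of noise stands in for the hammer strike.
    buffer = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for i in range(n_samples):
        current = buffer[i % period]
        following = buffer[(i + 1) % period]
        out.append(current)
        # Averaging neighbouring samples acts as a low-pass filter, so high
        # partials die out faster than low ones, as on a real string.
        buffer[i % period] = decay * 0.5 * (current + following)
    return out


samples = karplus_strong(440.0, n_samples=4410)  # 0.1 s at 44.1 kHz
```

In a hybrid design of the kind the abstract describes, a learned model would typically refine the output of such a physical stage; how the paper does this is not stated here.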
Systems and Soft Computing, Volume 8, Article 200435.
Citations: 0
Overcoming center-bias behavior: A metaheuristic algorithm with dual operators for optimized search and refinement
IF 3.6 Pub Date : 2025-12-27 DOI: 10.1016/j.sasc.2025.200436
Erik Cuevas, Oscar A. González-Sánchez, Héctor Escobar, Ernesto Ayala, Daniel Zaldívar, Marco Pérez-Cisneros, Alma N. Rodríguez-Vázquez
Most metaheuristic methods employ a single operator for both exploration and exploitation. While this keeps the algorithm simple, it can introduce inefficiencies, such as inadequate coverage of the search space, revisiting of irrelevant regions, and poor refinement of solutions. This paper introduces a new metaheuristic algorithm that uses two operators, one designed specifically for exploration and one for exploitation. During the initial phase, the exploration operator constructs a trajectory by sequentially interpolating points obtained with the Latin Hypercube Sampling technique. The trajectory covers the whole search space and acts as a guide for exploration throughout the run of the algorithm. As the search continues, each agent changes its position to follow this trajectory, so that the agents continue to explore the search domain and identify its most promising regions. In contrast, the exploitation operator uses a crossover operation in which an agent's present position is moved to a new position inside a ribbon-shaped area, determined by combining its position with the best solutions found so far; this lets the exploitation phase concentrate on refining these promising solutions further. Together, these operators provide a balanced approach to exploring and exploiting the search space, enhancing the algorithm's overall effectiveness. The proposed approach was validated by comparing it with various metaheuristic algorithms on a standard set of benchmark functions that have been shifted so as to accurately assess the performance of methods employing center-biased operators. The findings indicate that this method yields competitive outcomes, providing higher-quality solutions and faster convergence, while avoiding the drawbacks associated with algorithms reliant on center-biased operators.
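The exploration trajectory described in the abstract can be sketched as follows: draw Latin Hypercube samples in the unit cube, then join consecutive samples by linear interpolation. The abstract does not specify the interpolation scheme or the visiting order, so the linear interpolation, the function names, and the parameters below are assumptions.

```python
import random


def latin_hypercube(n, dim, rng):
    """Return n points in [0, 1)^dim with one point per stratum per axis."""
    samples = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)  # independent permutation for each dimension
        for i in range(n):
            # Place the point uniformly inside its assigned stratum.
            samples[i][d] = (strata[i] + rng.random()) / n
    return samples


def interpolate(points, steps):
    """Join consecutive points with `steps` linear segments each."""
    trajectory = []
    for a, b in zip(points, points[1:]):
        for s in range(steps):
            t = s / steps
            trajectory.append([x + t * (y - x) for x, y in zip(a, b)])
    trajectory.append(list(points[-1]))
    return trajectory


points = latin_hypercube(5, 2, random.Random(0))
trajectory = interpolate(points, steps=4)
```

Because every axis is split into `n` strata and each stratum is used exactly once, the sampled points cannot cluster near the centre of the domain, which is the property the paper relies on to avoid center bias.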
Systems and Soft Computing, Volume 8, Article 200436.
Citations: 0
Energy curve based multilevel thresholding with artificial hummingbird algorithm and minimum cross entropy
IF 3.6 Pub Date : 2025-12-26 DOI: 10.1016/j.sasc.2025.200432
Tirumalasetti Supraja, Kankanala Srinivas
Multilevel thresholding is an important technique in color image segmentation, yet traditional methods such as Otsu's often struggle to preserve meaningful structures in complex images. To address this limitation, we propose a hybrid segmentation framework that integrates the Artificial Hummingbird Algorithm (AHA) with the Minimum Cross Entropy Measure (MCEM) as the objective function. Instead of relying on the global histogram, the method employs an intensity-level energy curve to capture spatial intensity variation and fine edge information. AHA incorporates guided, territorial, and migration foraging strategies, enabling an effective balance between exploration and exploitation during threshold optimization. The proposed approach is evaluated across multiple threshold levels and benchmarked against Otsu- and MCEM-based methods enhanced through four metaheuristics: Aquila Optimizer (AO), Equilibrium Optimizer (EO), Particle Swarm Optimization (PSO), and Whale Optimization Algorithm (WOA). Performance is assessed using seven quantitative metrics: PSNR, SSIM, FSIM, QILV, correlation coefficient, Edge Preservation Index (EPI), and Mutual Information Factor (MIF). Experimental results on satellite images demonstrate that the proposed method delivers improved segmentation quality, robustness, and structural fidelity, showing strong potential for environmental monitoring, remote sensing, and disaster analysis applications.
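As a rough illustration of the objective being optimized, the sketch below evaluates the standard minimum cross entropy criterion (with its threshold-independent term dropped) on a grey-level histogram and finds a single threshold by exhaustive search. Two simplifications are assumptions of this sketch: a plain histogram stands in for the paper's energy curve, and exhaustive search stands in for AHA.

```python
import math


def cross_entropy_objective(hist, thresholds):
    """Minimum cross entropy objective for the given thresholds.

    hist[i] is the number of pixels with grey level i. Each class spans
    [lo, hi) and contributes -m1 * log(m1 / m0), where m0 is the class
    pixel count and m1 its intensity-weighted count. Lower is better.
    """
    edges = [0] + sorted(thresholds) + [len(hist)]
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        m0 = sum(hist[lo:hi])
        m1 = sum(i * hist[i] for i in range(lo, hi))
        if m0 > 0 and m1 > 0:  # skip empty classes to avoid log(0)
            total -= m1 * math.log(m1 / m0)
    return total


def best_single_threshold(hist):
    """Exhaustive search; a metaheuristic replaces this for many thresholds."""
    return min(range(1, len(hist)),
               key=lambda t: cross_entropy_objective(hist, [t]))


# Hypothetical bimodal histogram: dark pixels at level 50, bright at 200.
hist = [0] * 256
hist[50] = 400
hist[200] = 600
threshold = best_single_threshold(hist)
```

Exhaustive search costs grow exponentially with the number of thresholds, which is why multilevel thresholding is usually paired with a metaheuristic such as the one the paper proposes.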
Systems and Soft Computing, Volume 8, Article 200432.
Citations: 0
Plant leaves disease detection in complex background using automatic segmentation and improved multi-scale feature fusion network
IF 3.6 Pub Date : 2025-12-23 DOI: 10.1016/j.sasc.2025.200434
Apoorva Arora, Vinay Gautam
To address a range of problems and increase agricultural output, researchers have recently made wide use of Artificial Intelligence (AI) approaches in smart farming. Given the enormous diversity of plants in the world and the many diseases that reduce crop productivity, identifying and categorizing plant diseases is a difficult undertaking. One of the main areas of research globally has been the automatic segmentation of images showing plant leaf diseases. The efficiency of the proposed model is assessed using open-source datasets, such as FGVC 7, which comprise four distinct classes of leaf diseases. Each image undergoes three phases of pre-processing: first, a Rank Order Fuzzy (ROF) filter is applied to minimize background noise in the plant image; then the image is resized; and finally the data are augmented. The next step identifies disease spots using histogram-based methods on the L*a*b* color model. Finding disease spots before segmentation helps ensure proper segmentation, which is the goal of the next stage: dividing the leaf images into uniform areas. Furthermore, a fusion model for data categorization takes the patch segmentation data as input. To improve the system's accuracy while operating independently, different leaf disease zones are segmented, and patches are created using patch segmentation algorithms. The proposed model combines the advantages of Convolutional Neural Networks (CNNs) and Vision Transformers (ViT): it extracts and fuses strong early features by integrating CNN designs such as VGG16, InceptionV3, AlexNet, and GoogLeNet. After that, local characteristics are captured using a ViT model to accurately detect plant diseases. With an accuracy of 99.85%, the proposed fusion model performs better than similar previously published methods in the detection and categorization of several plant leaf diseases.
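To make the L*a*b* step concrete, the sketch below converts an 8-bit sRGB pixel to CIE L*a*b* (D65 white point) and flags pixels whose a* value is positive, since healthy green tissue has strongly negative a*. The conversion is the standard one; the simple a* threshold is an assumption of this sketch and is not the paper's histogram-based rule.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE L*a*b* under the D65 white point."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def is_lesion_candidate(rgb, a_threshold=0.0):
    """Hypothetical rule: a* above the threshold means 'not green'."""
    _, a_star, _ = srgb_to_lab(*rgb)
    return a_star > a_threshold


healthy = is_lesion_candidate((0, 255, 0))    # pure green pixel
spotted = is_lesion_candidate((139, 69, 19))  # brown pixel
```

Working in L*a*b* separates lightness from colour, so a rule on a* is less sensitive to shadows than a rule on raw RGB values.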
Systems and Soft Computing, Volume 8, Article 200434.
Citations: 0
Enhancing computer education through IoT-Enabled learning environments leveraging mobile edge computing for real-time feedback
IF 3.6 Pub Date : 2025-12-22 DOI: 10.1016/j.sasc.2025.200433
Xin Yu, Yinghua Tian
The integration of Internet of Things (IoT) technologies into educational systems has opened new pathways for enhancing teaching effectiveness and student engagement, particularly in the domain of computer education. Traditional computer education methods often face limitations in engagement monitoring, timely feedback delivery, and personalized instruction. To address these challenges, this research proposes an intelligent digital learning framework that enhances computer education through the integration of IoT devices and Mobile Edge Computing (MEC), enabling real-time feedback and adaptive instruction. In the proposed framework, IoT-enabled sensors collect real-time multimodal data reflecting students' engagement and interaction within the classroom. These data streams are offloaded to local MEC nodes, where lightweight deep learning models process the information rapidly, avoiding the latency of cloud-based processing. The Dynamic Elephant Herding Optimiser-driven Feedback Stacked Long Short-Term Memory (DEH-FStacked LSTM) network processes this data at the edge to detect behavioural cues and assess teaching quality. Simultaneously, the module learns optimal feedback policies based on student state transitions, guided by a reward function that maximizes learning effectiveness while minimizing response delay. By deploying these deep learning models on edge servers, the system ensures ultra-low latency, maintains data privacy, and delivers personalized, real-time interventions. Experimental results demonstrate overall performance above 92 %, highlighting its reliability, responsiveness, and practical value for real-world educational environments. The proposed method reaches the highest accuracy, 94.32 %, compared with other models such as Stacked LSTM (91.2 %), Bi-LSTM (92.0 %), CNN-LSTM (89.7 %), and Transformer-based architectures (93.1 %). It also provides better precision, recall, and F1-score, and shows a visible and quantifiable improvement over the current state of the art in real-time engagement prediction. This research validates that incorporating deep learning within an IoT and MEC-based infrastructure significantly enhances the quality, responsiveness, and overall effectiveness of computer education in higher learning institutions.
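For readers comparing the reported figures, the precision, recall, and F1-score mentioned above are computed from confusion-matrix counts as in the sketch below. The counts in the example are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for an "engaged" class: 90 correct detections,
# 10 false alarms, 5 missed students.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=5)
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are high; this makes it more informative than accuracy when engaged and disengaged students are unevenly represented.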
Systems and Soft Computing, Volume 8, Article 200433.
Citations: 0
Quantum inspired hyperparameter optimization for enhanced deep learning based intrusion detection in wireless sensor networks
IF 3.6 Pub Date : 2025-12-15 DOI: 10.1016/j.sasc.2025.200431
K. Vinotha, P. Eswaran
Wireless Sensor Networks (WSNs) are increasingly threatened by evolving denial-of-service (DoS) attacks that degrade communication reliability and energy efficiency. Existing intrusion detection systems (IDS) that employ deep learning models often struggle with suboptimal hyperparameter tuning, slow convergence, and poor generalization under dynamic WSN conditions. To address these limitations, this study proposes a quantum search-enhanced bat algorithm (QS-BAT) that integrates quantum-inspired search dynamics with adaptive swarm intelligence for efficient and precise hyperparameter optimization. Five deep learning architectures (CNN, GAN, GRU, LSTM, and Transformer) were optimized and evaluated on the WSN-DS dataset under multiple attack scenarios. Experimental results demonstrate that the QS-BAT-optimized Transformer achieves 95.8 % accuracy and 95.7 % F1-score, significantly outperforming Grid Search, Bat Algorithm, Cuckoo Search, and Sparrow Search Algorithm. The novelty of this study lies in coupling the quantum exponential search with adaptive elite selection, enabling faster convergence and superior IDS performance in energy-constrained WSN environments.
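The abstract does not describe the quantum-inspired components of QS-BAT, so the sketch below shows only a basic bat algorithm, the baseline that QS-BAT builds on, searching two hyperparameters. The objective is a hypothetical stand-in for "train the model and return its validation loss"; a single time-based pulse rate shared by all bats and the 10 % local-walk scale are simplifications chosen for this sketch.

```python
import math
import random


def bat_search(objective, bounds, n_bats=10, n_iter=50,
               f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Minimise `objective` over the box `bounds` with a basic bat algorithm."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[0.0] * len(bounds) for _ in range(n_bats)]
    fit = [objective(p) for p in pos]
    loud = [1.0] * n_bats  # loudness A_i, reduced whenever a bat improves
    best_i = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = list(pos[best_i]), fit[best_i]
    history = [best_fit]  # best-so-far value after each iteration

    for t in range(1, n_iter + 1):
        pulse = 0.5 * (1.0 - math.exp(-gamma * t))  # pulse rate grows with t
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Frequency-tuned move relative to the current best solution.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]
            cand = [x + v for x, v in zip(pos[i], vel[i])]
            if rng.random() > pulse:
                # Local random walk around the best solution found so far.
                cand = [b + rng.uniform(-1.0, 1.0) * mean_loud * 0.1 * (hi - lo)
                        for b, (lo, hi) in zip(best, bounds)]
            cand = clip(cand)
            cand_fit = objective(cand)
            if cand_fit <= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
            if cand_fit < best_fit:
                best, best_fit = list(cand), cand_fit
        history.append(best_fit)
    return best, best_fit, history


def toy_validation_loss(params):
    # Hypothetical surface with its minimum at lr = 1e-3, dropout = 0.2.
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.2) ** 2


bounds = [(-5.0, -1.0), (0.0, 0.5)]  # log10(learning rate), dropout
best, best_fit, history = bat_search(toy_validation_loss, bounds)
```

In a real tuning run each call to the objective trains a network, so the evaluation budget (`n_bats * n_iter`) dominates the cost; faster convergence, which the paper claims for QS-BAT, translates directly into fewer training runs.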
Citations: 0
MoistViT: A vision transformer model for moisture content prediction of wood chips
IF 3.6 Pub Date : 2025-12-11 DOI: 10.1016/j.sasc.2025.200429
Daniel E. Marulanda, Abdur Rahman, Jason Street, Mohammad Marufuzzaman, Haifeng Wang, Veera G. Gude, Randy Buchanan
Moisture content in wood chips is a critical parameter for industries such as pelleting mills, bio-refineries, paper mills, and renewable energy production. The moisture level significantly influences both the quality of the final product and the efficiency of the production process. Consequently, accurate knowledge of moisture content is of substantial importance to wood chip-reliant industries. However, current methods for determining moisture content are either time-consuming or require costly equipment and specialized setups. Therefore, developing a quick and reliable method for assessing wood chip moisture content is imperative. To address this need, we evaluate fourteen Vision Transformer (ViT) architectures and introduce an optimized model, MoistViT, developed using Bayesian Optimization Hyperband (BOHB) for efficient hyperparameter tuning. Experiments on two wood chip image datasets (1600 total images) show that MoistViT achieves 91% accuracy and 92% F1-score on Source 1 and 93% accuracy and 93% F1-score on Source 2, outperforming all baseline models. Subsequently, a thorough analysis of failure cases has been carried out, including the identification of the most challenging groups of moisture levels. These analyses provide valuable insights into the complex task of determining moisture content from inherently heterogeneous wood chips. The proposed MoistViT demonstrates significant potential for real-time applications in relevant industries, which could ultimately lead to a streamlined production process.
Citations: 0
DRSCNet: End-to-end attention-guided segmentation and sequence-aware classification for diabetic retinopathy
IF 3.6 Pub Date : 2025-12-10 DOI: 10.1016/j.sasc.2025.200430
Samatha Gaddam

Background

Diabetic Retinopathy (DR) is one of the most prevalent causes of blindness, affecting people of all ages. The conventional screening method is time-consuming and highly subjective due to inter-observer bias. Hence, automated systems greatly assist in classifying segmented DR features, increasing diagnostic speed and efficiency.

Method

This work presents a new artificial intelligence model, the DR Segmented Labels Classification Network (DRSCNet). The approach starts by segmenting five important retinal features, namely soft exudates (SE), the optic disc (OD), microaneurysms (MA), hard exudates (HE), and hemorrhages (HM), using an Attention-U-Net (AU-Net) architecture. These segmented regions are then passed through a Multi-Layer Convolutional Neural Network (MLCNN) for detailed feature extraction. Finally, a bidirectional long short-term memory (Bi-LSTM) network classifies the features extracted by the MLCNN.

Results

The proposed DRSCNet was tested on the Indian DR Image Dataset (IDRiD). The system achieved a segmentation accuracy of 99.10% and a classification accuracy of 99.19%, outperforming several existing techniques. As a case study, the proposed DRSCNet demonstrated consistently high performance across multiple datasets, achieving an accuracy of 99.01% on the Fine-Grained Annotated DR (FGADR) dataset, 99.83% on the Dataset for DR (DDR), and 99.19% on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. These results highlight the model's robustness and generalizability in automated DR diagnosis across diverse imaging sources.

Conclusion

The proposed DRSCNet combines the segmentation and classification modules to provide a comprehensive solution for automated diagnosis. Given its high accuracy, it could be used in clinical practice to support earlier diagnosis and management, thus decreasing the workload of healthcare practitioners.
Citations: 0
High-fidelity video frame interpolation through context-aware temporal aggregation and recurrent propagation
IF 3.6 Pub Date : 2025-12-08 DOI: 10.1016/j.sasc.2025.200428
Mohana Priya P, Ulagapriya K
Accurate inpainting of missing middle frames in video sequences is vital for applications such as video restoration, enhancement, and compression. This study introduces a deep learning-based framework that addresses this challenge by using adjoining sequences of preceding and following frames. Our approach integrates temporal aggregation and recurrent propagation to perform frame inpainting. Temporal aggregation leverages visible content from adjacent frames to recreate missing frames, ensuring high spatial fidelity and feature conservation. Optical flow estimation, using methods such as Farneback Optical Flow, estimates the displacement between frames and provides motion vectors that guide the interpolation process, enabling accurate alignment and blending of frames. Recurrent propagation is accomplished through Long Short-Term Memory (LSTM) networks that maintain temporal coherence by embedding and propagating information from preceding frames, thus ensuring smooth transitions and consistency across the video sequence. To further enhance performance, our model includes a context-aware feature extraction mechanism that adapts to various motion patterns and occlusions, optimizing the reconstruction quality. The framework was evaluated on the MSU Video Frame Interpolation (VFI) Benchmark Dataset, which provides diverse and challenging scenarios for interpolation, as well as the YouTube-8M dataset, which contains a wide range of real-world video content. The experimental results demonstrate the robustness of the proposed model: a PSNR of 32.00 and an SSIM score of 0.905 indicate superior reconstruction quality and structural similarity compared to baseline models. These results underscore the framework's effectiveness in handling complex motion dynamics and occlusions, making it well suited for advanced video restoration, enhancement, and compression tasks.
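The PSNR figure quoted above can be read against the standard definition, PSNR = 10 · log10(MAX² / MSE), usually reported in decibels. The sketch below computes it for two small grayscale frames stored as nested lists. The function name `psnr` and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not state the bit depth or color handling used in the evaluation.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale frames."""
    if len(reference) != len(reconstructed) or any(
        len(a) != len(b) for a, b in zip(reference, reconstructed)
    ):
        raise ValueError("frames must have the same shape")

    n_pixels = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, reconstructed)
        for a, b in zip(row_a, row_b)
    )
    mse = squared_error / n_pixels
    if mse == 0:
        return math.inf  # identical frames: no noise to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [[10, 20], [30, 40]]
    interpolated = [[11, 21], [31, 41]]  # every pixel off by one, so MSE = 1
    print(round(psnr(ground_truth, interpolated), 2))
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB, so small differences in the reported value correspond to sizeable differences in pixel error.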
Citations: 0