IEEE Access: Latest Articles

A New Touchscreen Cover for Braille-Based Data Entry on Mobile Devices
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-26 | DOI: 10.1109/ACCESS.2026.3658020
Ghassan Aladool;Olivier Togni
Advances in mobile technology, along with the anticipated global increase in the blind and low-vision population, emphasize the importance of introducing this community to such technology as an effective and low-cost communication tool. To enable blind and deaf-blind individuals to use mobile devices, many tactile-based methods and systems have been presented in the literature over the last decade. Although these methods are well regarded, their limitations include the need for object localization on mobile devices’ embedded touchscreens and the requirement to apply multiple gesture types, often involving several fingers from both hands. Addressing these limitations, this work presents a framework offering a new tactile-based communication method for blind individuals on mobile devices, with a touchscreen cover as its core component. The cover splits the touchscreen into eight equally sized cells (six for data entry and two for control input) and provides an effective solution to the localization and navigation challenges faced by blind individuals on touchscreens. In particular, the proposed framework enables Braille-based character entry. An Android app has also been developed to identify entered characters and save them for further processing. The proposed framework was tested at a foundation providing care for blind and low-vision individuals. A two-phase experiment (preliminary and final) was designed and conducted with ten participants. Experimental analysis based on measured data entry time and input error indicates that the proposed framework performs well. Furthermore, two participant surveys, one before and one after the experiments, were conducted to support the calculated performance measures, assess cognitive load, and verify the usability and learnability of the proposed framework.
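The six data cells correspond naturally to the six dots of a Braille cell, so character identification can be pictured as a lookup from the set of pressed cells to a letter. The following is a minimal sketch under stated assumptions, not the authors' Android implementation: the abstract does not say what the two control cells do, so the `CONFIRM` and `DELETE` events are hypothetical, and only the first five letters of the standard Braille alphabet are included.

```python
# Standard Braille dot numbering within one cell:
#   1 4
#   2 5
#   3 6
# Only letters a-e are listed here to keep the sketch short.
BRAILLE_LETTERS = {
    frozenset({1}): "a",
    frozenset({1, 2}): "b",
    frozenset({1, 4}): "c",
    frozenset({1, 4, 5}): "d",
    frozenset({1, 5}): "e",
}


def decode_cell(pressed_dots):
    """Return the letter for a set of pressed data cells, or None if unknown."""
    return BRAILLE_LETTERS.get(frozenset(pressed_dots))


def decode_entry(events):
    """Decode a stream of touch events into text.

    Integer events 1-6 are data cells. "CONFIRM" and "DELETE" stand in for
    the two control cells; their meaning is an assumption for illustration.
    """
    text = []
    current = set()
    for event in events:
        if event == "CONFIRM":
            letter = decode_cell(current)
            if letter is not None:
                text.append(letter)
            current = set()
        elif event == "DELETE":
            if current:
                current.clear()  # discard the partially entered character
            elif text:
                text.pop()       # otherwise remove the last stored character
        else:
            current.add(event)
    return "".join(text)


print(decode_entry([1, 2, "CONFIRM", 1, "CONFIRM", 1, 4, 5, "CONFIRM"]))
```

Because each character is a set of cells rather than an ordered gesture, the order in which the dots are touched does not matter in this sketch.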
IEEE Access, vol. 14, pp. 14931-14941. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11363568
Citations: 0
Dynamic Radar Cross-Section Estimation of Chaff Clouds Based on a Surrogate Model for Spatiotemporal Distribution
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-23 | DOI: 10.1109/ACCESS.2026.3657414
Jun-Seon Kim;Uk Jin Jung;Su Hong Park;Donghyun Kim;Moonhong Kim;Dongwoo Sohn;Dong-Wook Seo
This paper presents a novel surrogate modeling approach for estimating the dynamic radar cross-section (RCS) of chaff clouds under diverse launch and environmental conditions. A high-fidelity computational fluid dynamics–discrete element method (CFD-DEM) framework is first used to simulate the multiphysics behavior of chaff clouds generated by both naval and aircraft dispensers. These simulations generate detailed aerodynamic datasets, which are used to train a Gaussian process regression (GPR)–based surrogate model. The surrogate model enables efficient prediction of the spatiotemporal distribution of chaff clouds, incorporating variables such as wind speed, wind direction, and launch parameters. To estimate dynamic RCS, the spatiotemporal distributions are combined with approximation techniques, specifically the generalized equivalent conductor (GEC) and vector radiative transfer (VRT) methods. A real-time chaff cloud simulator with a graphical user interface is also developed, integrating aerodynamic modeling, RCS calculations, and signal fluctuation modeling. Simulation results demonstrate that the proposed surrogate model achieves high prediction accuracy, with normalized mean absolute errors (NMAE) of 0.0085 for naval chaff and 0.0176 for aircraft chaff. The dynamic RCS obtained via the surrogate model closely matches the CFD-DEM results while substantially reducing computational cost, thus offering practical utility for real-time system applications.
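To make the reported NMAE figures easier to interpret, the metric can be sketched as follows. The abstract does not state the normalization, and conventions differ (range, mean, or maximum of the reference data); this sketch assumes normalization by the range of the reference values, so it is an illustration rather than the authors' exact definition.

```python
def nmae(reference, predicted):
    """Mean absolute error divided by the range of the reference values.

    Range normalization is an assumption; the source abstract does not
    specify which normalization the authors used.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mae = sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference values have zero range")
    return mae / span


# Example: reference values from a simulation vs. surrogate predictions.
print(nmae([0.0, 10.0], [1.0, 9.0]))
```

Under this convention, a value of 0.0085 would mean the average absolute error is below one percent of the span of the reference data.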
IEEE Access, vol. 14, pp. 14857-14869. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11363218
Citations: 0
Sensorless Collaborative Impedance Control of Rehabilitation Robots via LQR and Model-Based Sliding Manifold
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-19 | DOI: 10.1109/ACCESS.2026.3655024
Brahim Brahmi
Human–robot interaction (HRI) remains a critical research area, particularly in assistive robotics, where intuitive, safe, and adaptive collaboration with human users is essential for real-world deployment. Despite significant advances in impedance-based and adaptive control strategies, many existing approaches rely on full-state measurements, exhibit limited capability in handling user intent, and lack adaptability to unstructured environments. This paper addresses these challenges by proposing a novel sensorless control architecture that integrates an optimal Linear Quadratic Regulator (LQR), a nonlinear force observer, and a robust sliding mode controller with a nonlinear model-based switching function. The primary objective is to achieve accurate trajectory tracking while minimizing user-applied interaction forces without requiring force or velocity sensors. The proposed controller dynamically generates reference trajectories through a human-cooperative LQR paradigm that penalizes both robot effort and human torque, enabling adaptive behavior based on inferred user intent. A nonlinear observer estimates interaction torques using only joint position measurements, facilitating intent inference and real-time impedance adaptation. These estimates are subsequently incorporated into the construction of model-based switching manifolds within the sliding mode controller, enhancing chattering mitigation and control decoupling under uncertainty and partial state observability. Theoretical analysis establishes global asymptotic stability, and comprehensive simulations conducted on a 2-DOF rehabilitation robot validate the proposed approach under multiple disturbance scenarios. Compared to conventional impedance control strategies, the proposed method demonstrates improved accuracy, robustness, and energy efficiency, highlighting its potential for deployment in sensor-limited, human-in-the-loop applications such as prosthetics, exoskeletons, and adaptive rehabilitation robotics.
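For readers unfamiliar with LQR, the core idea of trading state error against control effort can be shown on a one-dimensional system. This is a generic textbook sketch, not the paper's human-cooperative formulation: the scalar plant `x[k+1] = a*x[k] + b*u[k]`, the weights `q` and `r`, and the function name are all illustrative assumptions.

```python
def scalar_dlqr(a, b, q, r, iterations=500):
    """Discrete-time LQR for the scalar system x[k+1] = a*x[k] + b*u[k].

    Minimizes sum(q*x^2 + r*u^2) by iterating the scalar Riccati equation
    to a fixed point. Returns (k, p) where u = -k*x is the optimal feedback
    and p is the Riccati solution.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


# An open-loop unstable plant (|a| > 1) is stabilized by the LQR gain.
a, b = 1.2, 1.0
k, p = scalar_dlqr(a, b, q=1.0, r=1.0)
print(f"gain k = {k:.4f}, closed-loop pole = {a - b * k:.4f}")
```

Raising `r` relative to `q` makes control effort more expensive and yields a smaller gain, which is the same kind of trade-off the abstract describes when it penalizes both robot effort and human torque.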
IEEE Access, vol. 14, pp. 10760-10781. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357932
Citations: 0
PF-DTN: Predictive Routing for Intelligent Delay Tolerant Networks Using RNN-LSTM Deep Learning With Monte Carlo Dropout Uncertainty Estimation and Hybrid Deterministic, Probabilistic, and Uncertain Routing Strategies
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-19 | DOI: 10.1109/ACCESS.2026.3655507
El Mastapha Sammou
Delay tolerant networks (DTNs) represent an evolution of traditional ad hoc networks, specifically designed for extreme environments characterized by intermittent connectivity, unpredictable node mobility, high transmission delays, and frequent disruptions. Conventional routing protocols prove ineffective in these contexts, necessitating adaptive and robust solutions. In this paper, we propose PF-DTN (Predictive Forwarding for DTN), a hybrid adaptive routing algorithm that combines the prediction of future trajectories of DTN nodes using an LSTM (Long Short-Term Memory) model with relay selection leveraging deterministic, probabilistic, and uncertain strategies. PF-DTN also integrates uncertainty estimation via the Monte Carlo Dropout technique, enabling the dynamic adaptation of the routing strategy based on prediction reliability. The proposed PF-DTN architecture is structured into three phases: contextual mobility data collection, trajectory prediction with uncertainty estimation, and adaptive relay selection. Experimental evaluations conducted on the ONE simulator demonstrate that PF-DTN outperforms the benchmark protocol Prophet and remains competitive with recent state-of-the-art approaches. Our approach achieves a high delivery rate, with relative gains over Prophet of up to 46.15% in low- to medium-density networks, along with a latency reduction of up to 22.89% in high-density environments. The elevated overhead ratio, attributable to the computational demands of predictive and adaptive modules, represents a justified trade-off given the improvements in delivery reliability and latency control. These results demonstrate PF-DTN’s effectiveness and robustness across diverse DTN environments, ranging from predictable to highly dynamic and uncertain networks, establishing an optimal balance between reliability, speed, and communication cost.
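The link between Monte Carlo Dropout and strategy switching can be sketched as follows: several stochastic forward passes yield a spread of predictions, and the spread decides which routing mode to trust. This is a minimal illustration under stated assumptions, not PF-DTN's actual rule. The thresholds `low` and `high`, the function name, and the use of a single scalar prediction (for example, one coordinate of a predicted position) are hypothetical; the stochastic passes themselves are represented by a plain list of numbers.

```python
import statistics


def choose_strategy(samples, low=0.5, high=2.0):
    """Pick a routing mode from the spread of Monte Carlo Dropout samples.

    samples: predictions from repeated stochastic forward passes.
    low, high: hypothetical standard-deviation thresholds (same units as
    the prediction); the source abstract does not give actual values.
    """
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    if spread < low:
        mode = "deterministic"   # prediction is reliable, follow it directly
    elif spread < high:
        mode = "probabilistic"   # weight relays by predicted encounter chance
    else:
        mode = "uncertain"       # prediction unreliable, fall back
    return mean, spread, mode


print(choose_strategy([10.0, 10.1, 9.9])[2])
print(choose_strategy([8.0, 12.0, 10.0])[2])
print(choose_strategy([0.0, 20.0])[2])
```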
IEEE Access, vol. 14, pp. 10841-10859. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357868
Citations: 0
An Improved Template Inversion Attack Against Korean Face Images
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-19 | DOI: 10.1109/ACCESS.2026.3655482
Koy Motita;Sang Hoon Han;Younho Lee
Recent template inversion methods achieve strong performance on Western datasets but poor results on Korean facial images. This gap poses security risks for facial-recognition authentication systems in Korea, as stakeholders may mistakenly conclude that protecting the facial templates in their databases is unnecessary. We propose an enhanced template inversion method with three improvements: refined preprocessing using per-eye alignment and GFPGAN upscaling, an MSE-enhanced loss function, and dynamic weight adjustment. Our method outperforms existing approaches, achieving 0.882 cosine similarity, $9.7~L_{2}$ norm, and 0.312 LPIPS. Beyond reconstruction quality, we evaluate the security implications of template inversion attacks by analyzing verification robustness using Successful Attack Rate (SAR) at fixed False Match Rate (FMR) operating points, Receiver Operating Characteristic (ROC) curves, and Equal Error Rate (EER), providing a comprehensive assessment under realistic authentication thresholds. These results emphasize the critical need for robust facial template protection in facial recognition authentication systems.
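The reported metrics can be illustrated with a small sketch. Cosine similarity and L2 distance compare two feature vectors, and a Successful Attack Rate is the fraction of reconstructed faces whose similarity to the target clears the verification threshold. The assumptions here are that the scores are similarities (higher means a closer match) and that the threshold has already been chosen to correspond to a fixed FMR; the vectors, scores, and threshold below are made-up illustrative values, not data from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def successful_attack_rate(scores, threshold):
    """Fraction of attack attempts whose similarity score passes the threshold.

    The threshold is assumed to be fixed beforehand from a target FMR.
    """
    return sum(score >= threshold for score in scores) / len(scores)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(l2_distance([0.0, 0.0], [3.0, 4.0]))
print(successful_attack_rate([0.9, 0.4, 0.7, 0.2], threshold=0.5))
```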
IEEE Access, vol. 14, pp. 10871-10882. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357908
Citations: 0
Deep Learning-Based Faulty Component Diagnosis of Transmission Channels in ATE Affected by Thermal Degradation
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-19 | DOI: 10.1109/ACCESS.2026.3655081
Jimin Gu;Jeonghyeon Choi;Youbean Kim
As the operating frequency of automated test equipment (ATE) increases, the thermal degradation of the components that constitute the channel accelerates. Degraded components cause signal integrity (SI) issues in the channel, which is a major factor in reducing the test quality and, thus, degrading the reliability of the ATE. Traditionally, test engineers have detected degraded components through direct probing; however, this process is time-consuming and necessitates an automated faulty component diagnosis framework. Accordingly, in this study, we propose a deep learning-based faulty component diagnosis framework to identify components that cause signal quality degradation due to heat in the ATE transmission channel. To analyze the effect of the thermal degradation of individual components on signal quality, a component modeling approach utilizing electromagnetic (EM) simulation was employed to construct a database of S-parameter data based on the temperature of the component. The simulation model demonstrated a high correlation with the measurement waveform data, with an average consistency of 97.1%, thereby ensuring its reliability. Furthermore, to address the issue of data scarcity in industrial environments, a conditional generative adversarial network (CGAN) was developed to generate S-parameter image data. The generated data showed a high similarity to the original S-parameter image data, with an average structural similarity index measure (SSIM) of 0.9845 and a peak signal-to-noise ratio (PSNR) of 35.21 dB. The convolutional neural network (CNN)-based faulty component diagnosis model trained with augmented data exhibited excellent performance, classifying faulty component types with an accuracy of 99.78%.
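The PSNR figure quoted for the generated S-parameter images follows the standard definition, 10·log10(MAX² / MSE). A minimal sketch is given below; it assumes 8-bit pixel values (`max_value=255`) and flat lists of pixels, neither of which is stated in the abstract.

```python
import math


def psnr(original, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    max_value=255 assumes 8-bit images; this is an assumption, since the
    abstract does not give the image bit depth.
    """
    if len(original) != len(generated) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - g) ** 2 for o, g in zip(original, generated)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


print(psnr([0, 0], [255, 255]))
print(psnr([10, 20], [10, 20]))
```

Higher values indicate closer agreement, so a PSNR in the mid-30s dB range corresponds to a small mean squared error relative to the full pixel range.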
IEEE Access, vol. 14, pp. 11019-11034. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357867
Citations: 0
A Robust Occlusion-Aware Deep Learning Architecture for Thermal Aerial Person Classification in Search-and-Rescue Missions
IF 3.6 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-19 | DOI: 10.1109/ACCESS.2026.3655411
Christian Woesle;Leopold Fischer-Brandies;Ricardo Buettner
Uncrewed aerial vehicles equipped with long-wave infrared cameras are a promising tool for locating missing persons, yet their performance collapses when more than 70% of a human body is obscured by vegetation. We address this limitation by reformulating the task as a thermal classification problem rather than a detection problem, allowing the system to focus on thermal appearance cues that remain visible under heavy occlusion. Recognition robustness is further affected by high-frequency infrared sensor noise, which increases with flight altitude. Using five-fold cross-validation at flight altitudes of 30 m, 50 m, and 70 m, we show that the proposed classification pipeline achieves an accuracy of 99.07% and maintains strong recall under heavy occlusion while operating at a low computational cost. Analysis of altitude-specific models demonstrates that the Gaussian-enhanced pipeline provides a statistically significant improvement in robustness, with the largest gains observed at higher flight altitudes due to more effective suppression of altitude-dependent sensor noise. These findings establish an operational robustness benchmark for occlusion-aware thermal person classification and provide a reproducible foundation for improving the reliability of uncrewed aerial vehicle search-and-rescue systems.
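Five-fold cross-validation, as mentioned in the abstract, partitions the data into five disjoint test folds so that every sample is evaluated exactly once. The sketch below shows only the index bookkeeping. It is an assumption-laden illustration: it does not shuffle or stratify, and the abstract does not say how the authors handled either, or how the folds relate to flight altitude.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k (train, test) index pairs.

    Fold sizes differ by at most one. No shuffling or stratification is
    done here; in practice both are commonly applied first.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


for train, test in k_fold_indices(12, k=5):
    print(len(train), test)
```

The reported accuracy would then be an aggregate over the five held-out folds rather than a single train/test split.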
IEEE Access, vol. 14, pp. 10923-10938. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357877
Citations: 0
SWIFT-FMQA: Enhancing Factorization Machine With Quadratic-Optimization Annealing via Sliding Window
IF 3.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-19 DOI: 10.1109/ACCESS.2026.3655591
Mayumi Nakano;Yuya Seki;Shuta Kikuchi;Shu Tanaka
Derivative-free (DF) optimization problems aim to identify an input that maximizes or minimizes the output of an objective function whose input-output relationship is unknown. Factorization machine with quadratic-optimization annealing (FMQA) is a promising approach to this task, employing a factorization machine (FM) as a surrogate model to iteratively guide the solution search via an Ising machine. Although FMQA has demonstrated strong optimization performance across various applications, its performance often stagnates as the number of optimization iterations increases. One contributing factor to this stagnation is the growing number of data points in the dataset used to train FM. As more data are accumulated, the contribution of newly added data points tends to become diluted within the entire dataset. Based on this observation, we hypothesize that such dilution reduces the impact of new data on improving the prediction accuracy of FM. To address this issue, we propose a novel method named sliding window for iterative factorization training combined with FMQA (SWIFT-FMQA). This method improves upon FMQA by utilizing a sliding-window strategy to sequentially construct a dataset that retains at most a specified number of the most recently added data points. SWIFT-FMQA is designed to enhance the influence of newly added data points on the surrogate model. Numerical experiments demonstrate that SWIFT-FMQA obtains lower-cost solutions with fewer objective function evaluations compared to FMQA.
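The sliding-window dataset described in the abstract above can be illustrated with a minimal sketch. It shows only the data-retention step: keep at most a fixed number of the most recently added points. It is not the authors' implementation. The names `make_window`, `add_sample`, and `max_points`, and the toy binary inputs, are hypothetical; FM training and the Ising-machine search are omitted.

```python
from collections import deque


def make_window(max_points):
    """Create a buffer that retains at most `max_points` recent samples."""
    # A deque with maxlen silently drops the oldest item when it is full.
    return deque(maxlen=max_points)


def add_sample(window, x, y):
    """Append one (input, objective value) pair and return the current dataset."""
    window.append((tuple(x), y))
    return list(window)


# Toy loop: five evaluations, window size three (both values are illustrative).
window = make_window(3)
for step in range(5):
    x = [step % 2, (step + 1) % 2]   # placeholder binary input
    y = float(step)                  # placeholder objective value
    training_set = add_sample(window, x, y)

# Only the three most recent evaluations remain for the next surrogate fit.
print(training_set)
```

With a plain growing list, each new point would make up a shrinking share of the training data. The bounded buffer keeps that share at no less than one over the window size.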
IEEE Access, vol. 14, pp. 10977-10990. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357563
Citations: 0
Development of a Realistic Model to Accurately Predict the “Mirrored S-Curve” Nature of LED Luminaire Lumen Maintenance for Any Operating Conditions
IF 3.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-19 DOI: 10.1109/ACCESS.2026.3655701
Savitha G. Kini;J. Lokesh;Anjan N. Padmasali
In the current era, LED lighting technology is the most widely used illumination source in all applications worldwide. Accurately predicting lumen degradation and lifetime performance has become critical for ensuring long-term reliability and cost-effectiveness. Traditional models often fail to capture the complex, non-linear nature of real-world degradation behavior. The work systematically models the lumen degradation behavior of LED luminaires using a four-parameter double exponential Gompertz function. The proposed model effectively captures the asymmetric, mirrored S-curve behavior observed in long-term degradation profiles of LED luminaires, which traditional exponential models fail to represent accurately. Experimental data from accelerated degradation tests conducted on three different commercial 16W LED luminaires were used to develop the model. The SEM-EDS analysis identified silver mirror tarnishing as a dominant physical degradation mechanism, providing material-level insight into the observed steep lumen drop during mid-life operation. A key contribution of this work is the development of a predictive framework that correlates proposed model coefficients with temperature using only three accelerated degradation tests. This enables accurate estimation of lumen maintenance performance at untested operating conditions, significantly reducing the need for exhaustive physical testing. The proposed methodology provides a practical, scalable, and cost-effective solution for predicting LED lifetime, making it highly applicable to both research and industry. It supports sustainable lighting development by improving lifetime prediction accuracy while reducing experimental burden, thereby contributing to energy-efficient operation and responsible resource utilization.
IEEE Access, vol. 14, pp. 10860-10870. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11358875
Citations: 0
A Unified Lightweight Network for Complex Scene Image Understanding via Multi-Task Joint Learning
IF 3.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-19 DOI: 10.1109/ACCESS.2026.3655826
Tingting Guo;Sainan Yang;Yao Fu;Daitao Wang
Multi-task joint learning for complex scene image understanding faces multiple challenges, including diverse visual elements, task-specific demands, and constrained computational resources. These challenges are particularly prominent in specialized domains such as Intangible Cultural Heritage (ICH), where current research lacks effective joint modeling approaches for image classification, semantic segmentation, and object localization tasks. To address this gap, we introduce a novel multi-task visual understanding problem tailored for ICH scenarios, and construct a high-quality dataset—ICH-Scene3800—comprising 3,800 annotated images across 12 representative ICH categories. To tackle this task, we propose the first lightweight multi-task learning framework capable of performing image-level classification, instance-level localization, and instance-level detection simultaneously. The framework employs a shared backbone to learn general-purpose features and integrates an attention-guided dynamic fusion mechanism that facilitates cross-task semantic interaction. Furthermore, a group-convolution-based lightweight architecture is introduced to enable efficient feature extraction and resource-aware deployment. These designs significantly enhance the model’s generalization ability across tasks and scenes. Extensive experiments on ICH-Scene3800 and the Cityscapes dataset demonstrate that our model achieves 92.19% mIoU and 82.36% mIoU, respectively, with only 0.024M parameters and 0.085 GFLOPs. It reaches a real-time processing speed of 98.5 FPS on an NVIDIA GeForce GTX 1060 (6GB) and significantly outperforms existing methods on the LSES metric, achieving state-of-the-art performance. This research provides a practical and efficient solution for intelligent visual understanding in cultural heritage preservation and other resource-constrained application scenarios. The code and related materials are available at https://github.com/Upno111/ICH
IEEE Access, vol. 14, pp. 14916-14930. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11358991
Citations: 0