
Latest publications from IEEE Access

User Sentiment Analysis of Solar Energy Monitoring Apps: Insights From Google Play Reviews
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-03-04 · DOI: 10.1109/ACCESS.2026.3670366
Afnan Shobil;Salah Al-Hagree
With a focus on user reviews from the Google Play Store, this study investigates user perceptions of mobile applications designed for remote monitoring and control of solar energy systems. It applies sentiment analysis, a natural language processing technique, to extract useful information from structured and unstructured text, classifying opinions and comments as positive, neutral, or negative in order to understand user experiences, identify significant areas of satisfaction and dissatisfaction, and give application developers relevant information. More than 17,100 user comments across 18 applications updated in 2024 were collected with the Google-Play-scraper Python script. The reviews were labeled using the Valence Aware Dictionary and Sentiment Reasoner (VADER): a large majority of users, 67.4%, expressed primarily positive sentiments about the ease of use and effectiveness of these apps in monitoring solar systems, while approximately 15.1% of reviews were negative and 17.5% neutral. Classical supervised machine learning (ML) models were then used in combination with the lexicon-based VADER analysis to classify the sentiment expressed in comments. The Support Vector Machine (SVM) proved most effective for this sentiment classification task, achieving 91% accuracy, compared with Decision Tree (DT, 90%), Logistic Regression (LR, 89%), RidgeClassifier (87%), Random Forest (RF, 68%), and Naive Bayes (NB, 40%). The study also identified areas for improvement, such as addressing technical bugs, improving response times, and enhancing the user interface. These findings give developers valuable guidance for improving user experience, enhancing app functionality, and ultimately increasing user satisfaction with solar monitoring apps, and they show that sentiment analysis is a useful technique for classifying and assessing user reviews and feedback.
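As a rough illustration of the lexicon-based step, the sketch below classifies a review as positive, neutral, or negative from word valences. The tiny `LEXICON` and the ±0.05 cutoffs are illustrative stand-ins, not VADER's actual lexicon or scoring rules.

```python
# Simplified lexicon-based sentiment scorer, illustrating the idea behind
# VADER-style classification.  The LEXICON below is a toy stand-in.
LEXICON = {
    "great": 3.1, "easy": 1.9, "useful": 1.8, "love": 3.2,
    "slow": -1.5, "crash": -2.5, "bug": -1.9, "bad": -2.5,
}

def compound_score(text: str) -> float:
    """Average the valences of known words, squashed into [-1, 1]."""
    words = text.lower().split()
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0.0
    raw = sum(hits) / len(hits)
    return max(-1.0, min(1.0, raw / 4.0))  # toy valences lie in [-4, 4]

def classify(text: str) -> str:
    """Map a compound score onto positive / neutral / negative."""
    score = compound_score(text)
    if score >= 0.05:
        return "positive"
    if score <= -0.05:
        return "negative"
    return "neutral"
```

A trained classifier such as the SVM mentioned above would replace this rule by a model fitted on the VADER-labeled reviews.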
IEEE Access, vol. 14, pp. 38421–38433. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11421327
Citations: 0
Robust and Reusable Fuzzy Extractor for Low-Entropy Distributions and Application to User Authentication
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-03-03 · DOI: 10.1109/ACCESS.2026.3670084
Somnath Panja;Nikita Tripathi;Shaoquan Jiang;Reihaneh Safavi-Naini
Fuzzy extractors (FEs) are cryptographic primitives that generate a random key from a sample of a randomness source, together with helper data that can be used to reproduce the key from a second sample of the source that is "close" to the first. Reusable FEs can securely derive multiple keys; robust FEs detect any tampering with the helper data with overwhelming probability. This paper makes two contributions. First, we construct a computationally secure, robust, and reusable FE (rrFE) that satisfies the strongest notion of reusability security; its security reduces to the hardness of a new computational assumption closely related to the LPN problem, for which no efficient quantum algorithm is known. The proof is in the standard model, answering an open question of Canetti et al. (Journal of Cryptology 2021). We implement and evaluate our rrFE using a widely used iris data set as the randomness source and compare its performance with a reusable-only scheme for the same source. We also introduce and formalize password-protected fuzzy extractors (PPFEs), which use passwords as an additional source of entropy to harden biometric data against offline attacks (by a constant amount), and we present a PPFE construction with proven security. Second, we motivate and propose a new cryptographic primitive called Password Protected Message Retrieval (PPMR), which enables a user to securely store the helper data on a remote server and later securely recover it on a local device in one round of communication, without requiring a secret key and relying only on a retrieval password. The helper data is used to recover a biometric-based pre-shared key with the server. This removes the need to store sensitive biometric data on the device and allows the AKE protocol to be securely executed from any device that can correctly execute code (e.g., an uncorrupted terminal).
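The "reproduce a key from a close second sample" idea can be illustrated with the classic code-offset construction over a repetition code. This is only a toy sketch (the `REP`, `gen`, and `rep` names are mine); the paper's rrFE is a far stronger construction, but the helper-data mechanics are analogous. Here each key bit is repeated five times, so up to two bit flips per five-bit group are corrected.

```python
import hashlib
import secrets

REP = 5  # each key bit is repeated 5 times; majority vote corrects <= 2 flips per group

def _encode(bits):
    """Repetition-code encoding: repeat each bit REP times."""
    return [b for b in bits for _ in range(REP)]

def _decode(bits):
    """Majority-vote decoding of each REP-bit group."""
    return [int(sum(bits[i:i + REP]) > REP // 2) for i in range(0, len(bits), REP)]

def gen(w):
    """Generate (key, helper) from a noisy reading w (list of bits).
    helper = w XOR encode(k) is the classic code-offset construction."""
    k = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    helper = [wi ^ ci for wi, ci in zip(w, _encode(k))]
    key = hashlib.sha256(bytes(k)).hexdigest()
    return key, helper

def rep(w2, helper):
    """Reproduce the key from a second reading w2 that is close to w."""
    k = _decode([wi ^ hi for wi, hi in zip(w2, helper)])
    return hashlib.sha256(bytes(k)).hexdigest()
```

Note this toy version is neither reusable nor robust; those are exactly the properties the paper adds.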
IEEE Access, vol. 14, pp. 38230–38250. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11418906
Citations: 0
Multi-Expert Trajectory Prediction for Highway Weaving Sections Using Conflict Potential Energy and GAN
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-03-02 · DOI: 10.1109/ACCESS.2026.3669162
Lei Wang;Heng Zhang;Xiuying Wang;Zhiwei Guan;Mei Xiao;Jian Liu;Mingjiang Wei;Yong Pan
To address the challenge of lane-changing prediction for autonomous vehicles (AVs) in highway weaving sections, which are characterized by dynamic interactions in human-machine mixed traffic flows, this paper proposes a multi-expert collaborative prediction model based on a dynamic conflict field (DCF) and generative adversarial networks, termed G-MoE-WGAN. The model quantifies the dynamic game-theoretic intensity among interacting vehicles via the DCF and constructs a Mixture of Experts (MoE) system for intention decision-making and trajectory generation, employing adversarial training to optimize the physical plausibility and interaction adaptability of the predictions. Furthermore, a Physics-Informed Neural Network (PINN) is introduced to reconstruct and smooth raw naturalistic driving trajectories for model validation. Experimental results demonstrate that G-MoE-WGAN achieves a lane-changing intention classification accuracy of 94.35%, an improvement of 3.89% to 16.55% over baseline models. Within a 3-second prediction horizon, the Final Displacement Error (FDE) and Average Displacement Error (ADE) of the proposed model are significantly reduced by 8.61%–40.55% and 3.21%–40.27%, respectively. In 5-second long-term prediction, benefiting from the dynamic expert fusion mechanism, the ADE and FDE metrics exhibit error reductions of 3.19%–39.29% relative to comparative methods. The study indicates that the proposed conflict potential representation and multi-expert adversarial training mechanism effectively capture the spatiotemporal heterogeneity of highly dynamic interactions in weaving sections, significantly enhancing the robustness and interpretability of lane-changing predictions in complex scenarios.
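The ADE and FDE metrics quoted in this abstract are standard trajectory-prediction errors and are straightforward to compute; a minimal sketch, assuming trajectories are given as per-timestep (x, y) tuples:

```python
import math

def ade(pred, truth):
    """Average Displacement Error: mean Euclidean distance over all timesteps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)

def fde(pred, truth):
    """Final Displacement Error: Euclidean distance at the last timestep."""
    return math.dist(pred[-1], truth[-1])
```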
IEEE Access, vol. 14, pp. 34473–34492. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11417815
Citations: 0
Low-Cost FPGA-Enhanced CNN Accelerator for Real-Time YOLO Object Detection and Classification
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-03-02 · DOI: 10.1109/ACCESS.2026.3669451
John S. Fata;Wafa M. Elmannai
This work presents a low-cost FPGA-based accelerator for real-time object detection and classification using a compressed YOLOv3-Tiny model. Existing FPGA-based CNN accelerators tend to excel at one critical performance metric while sacrificing throughput, accuracy, or power efficiency. This is particularly the case for low-cost, resource-constrained devices, which often rely heavily on off-chip memory, hindering performance. To address these limitations, we introduce three novel contributions: 1) an iterative structured hardware pruning algorithm that removes the least important filters from the YOLO model in small increments; 2) a quantization-aware training (QAT) algorithm that adapts the scaling factor per layer; and 3) a custom RTL memory-mapping controller that prioritizes on-chip BRAM/URAM memory allocation to improve throughput while decreasing power consumption. With this approach, the model size was reduced by 13.3× while preserving accuracy. Implemented on a low-cost Kria KV260 FPGA, the approach achieved 93.8% detection accuracy, 24.3 FPS throughput, 41.59 ms latency, and just 2.13 W of power consumption, yielding a high-performing, efficient system that directly outperformed comparable low-cost designs. These results demonstrate that balanced, high-performance YOLO inference is attainable on low-cost FPGA hardware without reliance on off-chip memory.
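The abstract does not spell out the pruning criterion; a common stand-in, sketched here under the assumption that "least important" means smallest L1 norm, ranks filters by their absolute-weight sum and drops the weakest fraction in each round, mirroring the iterative small-increment scheme described above:

```python
def prune_filters(filters, fraction):
    """Rank filters by L1 norm and drop the lowest-scoring `fraction`.
    `filters` is a list of flat weight lists; returns surviving indices."""
    scores = [(sum(abs(w) for w in f), i) for i, f in enumerate(filters)]
    scores.sort()                       # smallest L1 norm = least important
    n_drop = int(len(filters) * fraction)
    dropped = {i for _, i in scores[:n_drop]}
    return [i for i in range(len(filters)) if i not in dropped]

def iterative_prune(filters, step, rounds):
    """Remove filters in small increments rather than in one pass,
    as the abstract describes; retraining between rounds is omitted."""
    keep = list(range(len(filters)))
    for _ in range(rounds):
        survivors = prune_filters([filters[i] for i in keep], step)
        keep = [keep[j] for j in survivors]
    return keep
```

In practice each round would be followed by fine-tuning before the next increment is removed.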
IEEE Access, vol. 14, pp. 34614–34642. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11417804
Citations: 0
A Web-Ready and 5G-Ready Volumetric Video Streaming Platform: A Platform Prototype and Empirical Study
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-03-02 · DOI: 10.1109/ACCESS.2026.3669714
A. Dominguez;B. da Costa Paulo;I. Burguera;I. Tamayo;A. Elosegi;S. Cabrero Barros;M. Zorrilla
Volumetric video represents a transformative advancement in multimedia, offering the ability to capture and render three-dimensional content for immersive and interactive experiences. As the demand for immersive web applications grows, the need for a robust platform to stream live volumetric video on the web becomes more critical. This paper presents a prototype designed to stream volumetric video over 5G networks using web technologies. The platform enables real-time transmission and rendering of volumetric video in point cloud format, compressed with the Draco codec, and streamed via WebSocket and HTTP/DASH protocols. We conducted an empirical study to evaluate the performance of these technologies under different network technologies, transport protocols and scenarios, including the experimental network testbed of a commercial network provider: TELENOR. This new evidence complements our previous empirical study that confirmed the readiness of current devices and browsers for web-based volumetric video streaming. The results highlight significant differences in device performance and offer valuable insights into the limitations and opportunities for the future of volumetric video streaming on the web. Moreover, we are publishing the dataset generated during our empirical evaluation as an additional contribution, as it can be used for further analysis, simulation and model training. Finally, the paper discusses the technical and practical considerations for deploying volumetric video applications, laying the foundation for further advancements in the field.
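WebSocket already frames its own messages, so the following is only a generic illustration of how compressed point-cloud chunks might be framed for transport; the 12-byte header layout (length plus capture timestamp) is my assumption, not the platform's actual wire format:

```python
import struct

# Minimal length-prefixed framing for compressed point-cloud chunks.
# Header: 4-byte payload length, 8-byte capture timestamp (microseconds).
HEADER = struct.Struct("!IQ")  # network byte order

def pack_frame(payload: bytes, timestamp_us: int) -> bytes:
    """Prepend a fixed header so the receiver can delimit and order frames."""
    return HEADER.pack(len(payload), timestamp_us) + payload

def unpack_frame(data: bytes):
    """Split a received frame back into (timestamp, payload)."""
    length, ts = HEADER.unpack_from(data)
    body = data[HEADER.size:HEADER.size + length]
    return ts, body
```

The timestamp lets the renderer drop late frames, which matters for the real-time playback the platform targets.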
IEEE Access, vol. 14, pp. 34655–34675. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11418643
Citations: 0
A Hybrid Fractional Chebyshev–Legendre Spectral Collocation Method for Hamilton–Jacobi–Bellman Equations
IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems) · Pub Date: 2026-02-27 · DOI: 10.1109/ACCESS.2026.3668899
Alvian Alif Hidayatullah;Subchan Subchan;Sena Safarina;Tahiyatul Asfihani;R. Mohamad Atok;Irma Fitria;Andriyansah;Kasno Pamungkas
Stochastic optimal control problems are commonly formulated as optimization problems constrained by stochastic dynamical systems, whose value functions satisfy Hamilton–Jacobi–Bellman (HJB) equations. Owing to their strong nonlinearity and high dimensionality, closed-form solutions of HJB equations are rarely available, thereby motivating the development of robust and highly accurate numerical methods. This research introduces two hybrid spectral–collocation strategies for the numerical solution of stochastic HJB equations, constructed from different combinations of orthogonal polynomial bases. The first strategy utilizes shifted Chebyshev polynomials for time approximation and fractional-order Legendre polynomials for state approximation, while the second utilizes shifted Legendre polynomials in time and fractional-order Chebyshev polynomials in state. A convergence analysis is developed within the Caputo fractional derivative framework to justify the proposed methods and to establish the associated error estimates. The resulting nonlinear algebraic system is then solved using the collocation method. Numerical simulations, including an application to a resource extraction model, confirm that the proposed methods attain a high level of accuracy and exhibit convergence rates in strong agreement with the theoretical predictions. These results demonstrate that the developed hybrid spectral–collocation frameworks constitute reliable and efficient tools for addressing stochastic optimal control problems based on HJB equations.
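A minimal sketch of the shifted Chebyshev basis used for the time approximation: the shifted polynomial is T*_n(x) = T_n(2x - 1) on [0, 1], evaluated here via the standard three-term recurrence (the fractional-order bases in the paper generalize this further):

```python
def shifted_chebyshev(n, x):
    """Evaluate the shifted Chebyshev polynomial T*_n(x) = T_n(2x - 1)
    on [0, 1] using the recurrence T_{k+1}(u) = 2u T_k(u) - T_{k-1}(u)."""
    u = 2.0 * x - 1.0          # map [0, 1] onto [-1, 1]
    t_prev, t_curr = 1.0, u    # T_0 = 1, T_1 = u
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * u * t_curr - t_prev
    return t_curr
```

Collocation then enforces the discretized HJB equation at the roots or extrema of these polynomials, turning it into the nonlinear algebraic system mentioned above.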
IEEE Access, vol. 14, pp. 34564–34584. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11415601
Citations: 0
Novel Medical Image Compression/Decompression Technique Based on Bicubic Interpolation With Matrix Reduction Algorithm
IF 3.6 CAS Tier 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-27 DOI: 10.1109/ACCESS.2026.3668851
Sazeen Taha Abdulrazzaq;Mohammed M. Siddeq;Mohd Asyraf Zulkifley;Mohd Hairi Mohd Zaman;Asraf Mohamed Moubark
Medical imaging is an important contributor to diagnostic accuracy and to monitoring various health conditions, enabling healthcare professionals to gain valuable insights into the internal structures and functions of the body. With the prevalence of telemedicine and big-data integration in healthcare, the effective storage and online transmission of these images have grown in importance. The two main challenges in this field are limited transmission bandwidth and storage capacity. Lossless compression techniques are necessary because information must be preserved to guarantee the integrity of medical images. Reducing the duration of compression and decompression processing is likewise a fundamental concern. In this work, an innovative technique for compressing medical images is developed and evaluated. It shows superior image reconstruction quality and a compression performance of up to 99%. The compression technique involves downscaling medical images through bicubic interpolation. Matrix reduction is subsequently employed to reduce the interpolated matrix to one-fourth of its size. Finally, arithmetic coding is applied to further improve compression efficiency. The proposed algorithm emerges as a superior candidate for image compression applications, outperforming the advanced compression methods evaluated. The effectiveness of this method has been assessed on X-ray, ultrasound, CT, and MRI images sourced from an available database. Various performance metrics, including peak signal-to-noise ratio (PSNR), mean square error, and the structural similarity index measure (SSIM), have been utilized to evaluate the quality of images compressed using the proposed algorithm. The results show that at a high compression ratio of 100:1 (up to 99%), the proposed method achieves high PSNR and SSIM values.
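The evaluation pipeline described above (downscale, reconstruct, then score with PSNR) can be sketched end to end. The snippet below is illustrative only: a 2x2 block average stands in for the paper's bicubic interpolation, an assumption made to keep the example NumPy-only, and only PSNR is computed.

```python
import numpy as np

def downscale2x(img):
    """Average non-overlapping 2x2 blocks (stand-in for bicubic downscaling)."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2x(img):
    """Nearest-neighbour upscaling back to the original grid."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(float)   # synthetic 8-bit image
rec = upscale2x(downscale2x(img))                          # 4:1 reduce, reconstruct
print(f"PSNR after 4:1 reduction and reconstruction: {psnr(img, rec):.2f} dB")
```

On real medical images, which are far smoother than this random test pattern, the reconstruction PSNR would be much higher; the random input is used only so the example is self-contained.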
Citations: 0
The Role of Reinforcement Learning in Production Control: A Systematic Literature Review
IF 3.6 CAS Tier 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-27 DOI: 10.1109/ACCESS.2026.3668903
Jonas Schneider;Carl Pfannschmidt;Peter Nyhuis;Matthias Schmidt
The expansion of product portfolios, the reduction of product life cycles and the volatility of markets pose significant challenges for production systems and their control. Concurrently, these trends present a challenge in achieving an optimal balance between logistical performance and internal company costs. The discordance between the escalating demands of customers on logistics performance, exemplified by metrics such as throughput time and schedule reliability, and the cost-driven corporate objective of minimizing work-in-process and maximizing machine utilization, is becoming increasingly challenging to reconcile. The application of reinforcement learning (RL) is a significant machine learning (ML) approach for overcoming these challenges. In comparison with other ML approaches, RL facilitates direct interaction with production systems and is consequently well suited for controlling them in operational use. Despite the extensive body of research on RL approaches for production control tasks, there is a paucity of literature addressing the influence of these approaches on key logistical targets for logistics performance and costs. The article’s added value derives from the systematic literature review it conducts, which provides researchers and practitioners with an overview of how existing RL approaches influence central logistical target variables. Furthermore, it highlights blind spots in the research landscape. The results indicate the existence of a substantial number of approaches; however, their distribution across control tasks is disproportionate. Furthermore, it is evident that there are distinct discrepancies in the classification system with respect to the impact on logistical target variables.
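The direct-interaction loop that makes RL attractive for production control can be made concrete with a toy example. Everything below is a hypothetical sketch, not a method from the surveyed literature: a tabular Q-learning agent dispatches a single machine between two job queues, with a reward that penalizes work-in-process, one of the logistical target variables the review tracks.

```python
import numpy as np

rng = np.random.default_rng(1)
CAP = 5                                  # queue lengths are observed up to this cap
Q = np.zeros((CAP + 1, CAP + 1, 2))      # state: (len(q0), len(q1)); action: queue id
alpha, gamma, eps = 0.2, 0.95, 0.1       # learning rate, discount, exploration

def step(state, action):
    """Serve one job from the chosen queue, then apply random arrivals."""
    q0, q1 = state
    if action == 0 and q0 > 0:
        q0 -= 1
    elif action == 1 and q1 > 0:
        q1 -= 1
    q0 = min(CAP, q0 + (rng.random() < 0.4))   # Bernoulli arrivals per queue
    q1 = min(CAP, q1 + (rng.random() < 0.4))
    return (q0, q1), -(q0 + q1)                # reward penalizes work-in-process

state = (0, 0)
for _ in range(20000):
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[state]))
    nxt, r = step(state, a)
    Q[state][a] += alpha * (r + gamma * Q[nxt].max() - Q[state][a])
    state = nxt

print("trained Q-table:", Q.shape, "min value:", round(float(Q.min()), 2))
```

As training lengthens, the greedy policy tends toward serving the longer queue, which is exactly the kind of WIP-oriented behaviour the review examines when it asks how RL approaches influence logistical targets.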
Citations: 0
Structure-Consistent Contrastive Learning for Unpaired Image Translation With Gradient-Domain Constraints
IF 3.6 CAS Tier 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-27 DOI: 10.1109/ACCESS.2026.3669078
Muhammad Awais Arshad;Haneul Lee;Hosun Lee;Myeongjin Kang;Yeowon Kim;Hyochoong Bang
Unpaired image-to-image translation is fundamental to autonomous driving, robotics, aerospace, remote sensing, and medical imaging, where visual realism must be achieved without compromising scene geometry. Generative Adversarial Network (GAN) based methods remain the most computationally feasible option for real-time deployment; however, they frequently introduce structural distortions, while diffusion- and transformer-based models offer stronger controllability at a prohibitive computational cost. We propose CUT-GDC, a compact, structure-aware GAN framework that combines patchwise contrastive learning with gradient-domain constraints to enhance global geometric fidelity. CUT-GDC preserves the efficiency of the GAN-based architecture while enforcing the alignment of edge and gradient information to prevent global layout drift. Extensive experiments on multiple public benchmarks show that CUT-GDC consistently outperforms established GAN-based baselines. Compared with CUT, CUT-GDC reduces the average FID from 210.278 to 121.582 and KID from 0.199 to 0.062, and improves SSIM from 0.361 to 0.501. CUT-GDC also yields higher downstream segmentation performance, improving mIoU (24.7 → 28.63), pixel accuracy (68.8% → 70.5%), and class accuracy (30.7% → 41.4%) relative to CUT on the Cityscapes dataset. Edge-structure evaluation further verifies geometric fidelity, where CUT-GDC consistently improves Canny-IoU and Grad-IoU across validation tasks (e.g., Sim-to-Real IR: 0.659/0.737 vs. 0.256/0.321), confirming superior contour alignment and gradient consistency. Ablation studies on the flower dataset confirm that gradient-domain constraints are a reliable driver of structural gains, reducing FID to 85.035 (vs. 90.647 for CUT and 89.809 for CycleGAN) and raising SSIM to 0.761 (vs. 0.609 and 0.748, respectively).
CUT-GDC maintains memory usage comparable to competing models while delivering faster training and inference throughput, enabling practical deployment in real-time and hardware-restricted environments. Qualitative analysis, including t-SNE visualizations, further demonstrates improved domain alignment.
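The gradient-domain constraint at the heart of CUT-GDC can be sketched in isolation. The snippet below is an assumption-laden illustration, not the paper's exact loss: plain forward differences stand in for whatever gradient operator the authors use, and the penalty is a mean L1 distance between the gradient fields of the source and the translated image, so structure-preserving changes (like a brightness shift) cost nothing while layout drift is penalized.

```python
import numpy as np

def grads(img):
    """Forward-difference spatial gradients, zero-padded to the input shape."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy[:-1, :] = img[1:, :] - img[:-1, :]
    return gx, gy

def gradient_domain_loss(src, gen):
    """Mean L1 distance between gradient fields of source and generated image."""
    sx, sy = grads(src)
    gx, gy = grads(gen)
    return float(np.mean(np.abs(sx - gx)) + np.mean(np.abs(sy - gy)))

src = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))   # horizontal intensity ramp
same_structure = src + 0.3                         # brightness shift, same edges
warped = src.T                                     # transposed ramp: structure changed
print(gradient_domain_loss(src, same_structure))   # ~0: gradients unchanged
print(gradient_domain_loss(src, warped) > 0.0)     # structure differs, loss > 0
```

In a full training setup this term would be added, with a weight, to the adversarial and patchwise contrastive objectives, pulling the generator toward outputs whose edge map matches the source.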
Citations: 0
Application-Specific Instruction-Set Processors (ASIPs) for Deep Neural Networks: A Survey
IF 3.6 CAS Tier 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-27 DOI: 10.1109/ACCESS.2026.3668964
Muhammad Ali;Diana Göhringer
The demand for artificial intelligence applications is rising in every field, and with it, machine learning algorithms and techniques are becoming more complicated and compute-intensive. For embedded devices, these compute- and memory-intensive algorithms are becoming a challenge. Dedicated hardware accelerators are generally the leading solution for embedded systems because of their high performance and low area and power footprint. Although hardware accelerators are a great solution for machine learning, they lack flexibility and programmability. Another alternative is General-Purpose Processors (GPPs), which provide better flexibility and programmability than hardware accelerators but fall short in performance, area, and power. Between these two plausible solutions there is a large gap across these metrics, which can be bridged by Application-Specific Instruction-set Processors (ASIPs). ASIPs are specialized processor systems whose architecture is tailor-made for a certain application. ASIPs provide better performance than general-purpose processors and more flexibility and programmability than hardware accelerators. The main goal of an ASIP design is to maximize the performance-to-power ratio for a specific application. This work provides an in-depth survey of different ASIP designs with a focus on Deep Neural Networks (DNNs). For a generic comparison, the surveyed ASIP designs are classified by their microarchitecture approach and the optimizations adopted. The strong and weak points of the different optimizations and microarchitectures are pointed out, and future trends in ASIP design for fast machine learning are identified.
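The performance argument for ASIPs can be made concrete with a back-of-the-envelope model. All numbers and the instruction-count model below are hypothetical: the point is only that a custom fused multiply-accumulate (MAC) instruction halves the dynamic instruction count of the dot products that dominate DNN inference, which is the kind of performance-per-power lever a tailored instruction set adds over a GPP.

```python
# Toy dynamic-instruction-count model (all assumptions, not survey data):
# a GPP computes a dot product with separate multiply and add instructions,
# while an ASIP with a fused MAC custom instruction issues one per element.

def dot_instruction_count(n, fused_mac):
    """Instructions to compute an n-element dot product under this simple model."""
    per_element = 1 if fused_mac else 2   # single MAC vs separate mul + add
    return n * per_element

n = 1024
gpp = dot_instruction_count(n, fused_mac=False)
asip = dot_instruction_count(n, fused_mac=True)
print(gpp, asip, gpp / asip)   # 2048 1024 2.0
```

Real ASIP gains also come from wider SIMD datapaths, local memories, and loop support, so this 2x is a deliberately conservative illustration of the tradeoff the survey classifies.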
Citations: 0