Classifying sequential–global cognitive styles is essential for developing adaptive and personalized learning systems. Existing studies have relied on aggregated gaze statistics from proprietary eye tracking software, limiting feature diversity and classification accuracy. To address this gap, this study proposes a classification framework based on the Felder–Silverman Learning Style Model (FSLSM) that leverages time series eye tracking data and deep learning. Features were extracted from $x$- and $y$-coordinate gaze data using eight temporal window scales. The experimental results show that the proposed framework accurately distinguishes sequential and global cognitive styles. Among the evaluated methods, the Temporal Convolutional Network (TCN) combined with Robust scaling achieved the best classification accuracy of 99.51%. A temporal window of approximately 2.0 seconds (i.e., 121 samples) yielded the best performance, and the feature set comprising gazeX, gazeY, speed, and direction provided the strongest discriminative capability. Our findings underscore the potential of time series eye tracking for identifying cognitive styles. This research serves as a crucial initial step toward improving biometric-driven approaches to personalized education and adaptive learning technologies.
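The feature set described above (gazeX, gazeY, speed, and direction, segmented into 121-sample windows) can be sketched as follows. The sampling rate (~60 Hz, so that 121 samples span roughly 2.0 s) and the non-overlapping windowing are assumptions for illustration; the abstract does not specify either.

```python
import numpy as np

def gaze_features(gaze_x, gaze_y, dt=1 / 60):
    # Velocity components between consecutive gaze samples
    # (dt assumes a ~60 Hz tracker; adjust to the real device).
    dx = np.diff(gaze_x)
    dy = np.diff(gaze_y)
    speed = np.hypot(dx, dy) / dt      # point-to-point gaze speed
    direction = np.arctan2(dy, dx)     # movement direction in radians
    # Keep the first N-1 coordinate samples so all columns align.
    return np.stack([gaze_x[:-1], gaze_y[:-1], speed, direction], axis=1)

def sliding_windows(features, window=121, step=121):
    # Non-overlapping 121-sample windows (~2.0 s); overlapping
    # windows would use a smaller step.
    n = (len(features) - window) // step + 1
    return np.stack([features[i * step : i * step + window] for i in range(n)])
```

Each window then becomes one multivariate time series instance for a classifier such as a TCN.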
"Time Series Classification on Eye Tracking for Identification of Sequential-Global Cognitive Styles," by Hafzatin Nurlatifa; Teguh Bharata Adji; Igi Ardiyanto; Generosa Lukhayu Pritalia; Sunu Wibirama. IEEE Access, vol. 14, pp. 34676–34691. DOI: 10.1109/ACCESS.2026.3668841. Published 2026-02-27.
Parallel computing is crucial for enhancing the computational efficiency of ocean numerical models. Traditionally, parallelization in such models has relied primarily on MPI. Subsequent developments introduced GPU-oriented computing tools, including CUDA and OpenACC, which enable accelerated computation on NVIDIA GPUs. However, each tool has its own syntax, and they lack interoperability. Recent support for Standard Language Parallelism from major Fortran compiler vendors (e.g., Intel, NVIDIA) has unlocked efficient parallelization across multi-core CPUs and GPUs from a single Fortran codebase. This study employs Standard Language Parallelism to accelerate the A2D model, a two-dimensional unstructured-grid ocean numerical model, by parallelizing loops in its source code. The refactored code can be compiled for either multi-core CPU or GPU execution simply by selecting the appropriate compiler flags. This approach offers advantages including high code readability, broad hardware compatibility, and reduced maintenance overhead. Using compilers such as ifort or nvfortran, the parallelized model achieves speedups exceeding $23\times$ in both multi-core and GPU configurations compared to the original serial implementation.
"Accelerating the A2D Ocean Model With Standard Language Parallelism," by Bingrui Chen; Yizhong Chen; Hongyuan Guo; Jianrong Zhu. IEEE Access, vol. 14, pp. 34643–34654. DOI: 10.1109/ACCESS.2026.3668413. Published 2026-02-26.
Targeting active load balancing in power distribution networks with scalable ultra-fast charging, this study proposes an intelligent, adaptive regulation method based on multi-source data fusion. By constructing a collaborative control architecture between charging facilities and the distribution network, it designs a load balancing algorithm responsive to frequency and voltage fluctuations, as well as an in-station resource optimization and scheduling mechanism driven by fused multi-source measurement data. The method dynamically generates and issues control instructions based on real-time information about the state of the distribution network (such as frequency and voltage fluctuations) and local controllable resources (including photovoltaics, energy storage, and charging piles), achieving rapid response and optimized allocation of ultra-fast charging loads. Experimental validation shows that the proposed regulation strategy completes dynamic adjustment of load instructions within 5 seconds, with a steady-state error of less than 0.05 mV, significantly reducing charging time while effectively supporting the safe and stable operation of the distribution network. The findings provide crucial technical support for the adaptive regulation of distribution networks integrating scalable ultra-fast charging facilities.
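The abstract does not give the balancing algorithm itself; a standard starting point for frequency- and voltage-responsive load control is a droop-style proportional setpoint. The sketch below is a hedged illustration of that idea, not the paper's method; the gains `k_f`, `k_v` and the power limits are hypothetical.

```python
def charging_power_setpoint(p_nominal, freq, volt,
                            f_ref=50.0, v_ref=1.0,
                            k_f=2.0, k_v=1.0,
                            p_min=0.0, p_max=1.5):
    # Droop-style response: reduce charging power (per-unit) when grid
    # frequency or voltage sags below reference, raise it when they run
    # high, then clamp to the station's feasible power range.
    p = p_nominal + k_f * (freq - f_ref) + k_v * (volt - v_ref)
    return min(max(p, p_min), p_max)
```

A real controller would dispatch this setpoint across photovoltaics, storage, and chargers inside the station, which is where the multi-source data fusion comes in.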
"Multi-Source Data-Driven Active Balancing Study of Ultra-Fast Charging Loads," by Wen Wang; Ye Yang; Fan Wu; Xiangliang Fang; Xiujuan Zeng; Tong Liu; Bin Zhu. IEEE Access, vol. 14, pp. 38215–38229. DOI: 10.1109/ACCESS.2026.3668280. Published 2026-02-26.
Pub Date: 2026-02-26. DOI: 10.1109/ACCESS.2026.3668480
Abdul Kadar Muhammad Masum;Khandaker Mohammad Mohi Uddin;Chanda Rani Debi;Ramona Birău;Virgil Popescu;Md. Abul Kalam Azad
This work introduces a complete computational framework that combines game theory, sustainability metrics, and regulatory compliance into a unified decision support system for portfolio design. Traditional Environmental, Social, and Governance (ESG) investing often fails to adequately consider mandatory compliance, which we conceptualize as ESG Compliance (ESGC), an indicator of the degree to which a company meets sustainability standards and reporting regulations. To bridge the gap between financial maximization and regulatory demand, we propose a Game-Theoretic ESG Decision Support System (DSS), which captures the strategic interplay between three distinct investor archetypes: retail, institutional, and regulatory. The workflow encompasses rigorous data cleaning, feature engineering, and game-theoretic optimization based on Nash equilibrium, using a multi-region panel of 4,837 firms. Our empirical results demonstrate that the game-theoretic strategy achieves a mean annual return of 25.83% (a 117.14% cumulative return over the study period) and a Sharpe ratio of 1.1623, significantly outperforming the standard Markowitz mean-variance benchmark (Sharpe ratio: 0.9652). Furthermore, the framework maintains superior sustainability alignment, with a mean ESG score of 72.51 and a mean ESGC score of 79.23. When adjusted for sustainability quality, the model generates an ESG-adjusted Sharpe ratio of 2.0051, outperforming the ESG-Floor MVO benchmark (1.69). This framework provides a robust tool that integrates financial performance, sustainability, and policy into a single, mathematically rigorous decision-making environment.
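The Sharpe ratios quoted above follow the classic ex-post definition (mean excess return over its standard deviation). A minimal sketch, with an illustrative return series; the paper's ESG-adjusted variant is not reproduced here because its exact formula is not given in the abstract:

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    # Ex-post Sharpe ratio: mean excess return divided by the sample
    # standard deviation of excess returns (periodic, not annualized).
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)
```

Annualizing would multiply by the square root of the number of periods per year, a convention the abstract does not state.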
"Nash Equilibrium in Sustainable Finance: Designing a Game-Theoretic DSS for Compliance-Aligned Portfolio Optimization," by Abdul Kadar Muhammad Masum; Khandaker Mohammad Mohi Uddin; Chanda Rani Debi; Ramona Birău; Virgil Popescu; Md. Abul Kalam Azad. IEEE Access, vol. 14, pp. 34356–34374. DOI: 10.1109/ACCESS.2026.3668480. Published 2026-02-26.
Pub Date: 2026-02-26. DOI: 10.1109/ACCESS.2026.3668201
Yosuke Naruse
This paper addresses the optimization of illumination and camera imaging conditions in imaging systems equipped with multi-channel illumination for visual inspection. While previous studies have focused primarily on high dynamic range (HDR) imaging, in practical settings HDR acquisition is time-consuming, and imaging is typically performed with a single fixed exposure time. To enable the practical application of coded illumination for visual inspection, we extend the method to jointly optimize camera exposure time along with illumination conditions. We propose a unified framework that optimizes both texture and brightness based on the bidirectional texture function (BTF), leveraging the full expressive power of the imaging device under the common hardware constraint that illumination luminance is quantized. Furthermore, we demonstrate that the global optimum of Fisher’s linear discriminant analysis (LDA), subject to non-negative light intensity constraints, can be computed using semidefinite programming (SDP), and that this solution effectively enhances foreground contrast. Experimental results show strong agreement between simulated and real images in an actual visual inspection hardware environment, thereby validating the feasibility of illumination condition optimization using a digital twin approach.
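The unconstrained core of Fisher's LDA referenced above has a closed form, w ∝ Sw^{-1}(m1 - m2), which is what the paper's constrained SDP formulation generalizes. This sketch shows only the unconstrained direction between "foreground" and "background" pixel feature samples; the non-negativity constraint and the SDP solution are not reproduced here.

```python
import numpy as np

def fisher_lda_direction(X_fg, X_bg):
    # Unconstrained Fisher discriminant direction: within-class scatter
    # (sum of class covariances) inverted against the mean difference.
    m1, m2 = X_fg.mean(axis=0), X_bg.mean(axis=0)
    Sw = np.cov(X_fg, rowvar=False) + np.cov(X_bg, rowvar=False)
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)
```

Projecting pixels onto w maximizes the separation between the two classes relative to their spread, which is the "foreground contrast" notion the paper optimizes under illumination constraints.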
"Coded Illumination Under Quantized Luminosity for Visual Inspection," by Yosuke Naruse. IEEE Access, vol. 14, pp. 34390–34403. DOI: 10.1109/ACCESS.2026.3668201. Published 2026-02-26.
Pub Date: 2026-02-26. DOI: 10.1109/ACCESS.2026.3668745
Maximilian Weigand;Felix Gehlhoff;Alexander Fay
In engineering disciplines such as mechanical or automation engineering, data exchange between different software tools often relies on non-standardized interfaces. While Semantic Web technologies (SWTs) offer solutions for efficient and versatile data exchange, most engineering software lacks interfaces based on these technologies. Previous research has explored approaches to abstracting structured data from external sources as virtual knowledge graphs (VKGs). In this work, we propose an approach for creating VKGs from data available via the APIs of engineering software, enabling the querying of engineering data through SWTs and thus facilitating standardized data exchange. We first mathematically introduce the components of object-oriented software APIs, including classes, their hierarchical relationships, attributes, and instances. We then define formulas to derive the triples of the VKG representing these components, as well as the triples that follow from RDFS entailment rules. Finally, we define a procedure based on these formulas that enables querying the VKG using triple patterns. This provides access to the VKG representing the software-internal data without materialization, in line with the virtual knowledge graph concept. We provide a general implementation of the approach, supporting triple pattern and SPARQL queries, and a software-specific adaptation for a particular engineering software product. Using this, we demonstrate the capabilities of the approach through an industrial use case. The performance characteristics of the approach are evaluated by analyzing query execution times and scaling behavior across different query types and graph sizes, in comparison with equivalent materialized graphs.
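The idea of exposing object-oriented API data as virtual triples can be illustrated with a toy mapping; the example class, the predicate prefixes, and the naming scheme below are hypothetical and stand in for the paper's formal derivation formulas.

```python
class Motor:
    # Stand-in for an object reachable through an engineering tool's API.
    def __init__(self, name, rpm):
        self.name = name
        self.rpm = rpm

def instance_triples(obj):
    # Yield RDF-style (subject, predicate, object) triples on demand,
    # without materializing a graph: one rdf:type triple for the class,
    # then one triple per instance attribute.
    subject = f"inst:{id(obj)}"
    yield (subject, "rdf:type", f"cls:{type(obj).__name__}")
    for attr, value in vars(obj).items():
        yield (subject, f"attr:{attr}", value)
```

A triple-pattern query engine would iterate such generators lazily, so the "graph" only ever exists as live API objects.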
"Enhancing Software Interoperability Through Virtual Knowledge Graphs From Object-Oriented APIs," by Maximilian Weigand; Felix Gehlhoff; Alexander Fay. IEEE Access, vol. 14, pp. 34339–34355. DOI: 10.1109/ACCESS.2026.3668745. Published 2026-02-26.
Pub Date: 2026-02-25. DOI: 10.1109/ACCESS.2026.3667968
Youngmin Seo;Jinha Kim;Unsang Park
We present the Swish-T family of activation functions, which extends Swish by integrating a bounded, zero-centered Tanh-based bias term inside the activation. This design provides finer control near the activation threshold while preserving computational simplicity as a drop-in replacement. We evaluate Swish-T on diverse benchmarks, including MNIST, Fashion-MNIST, SVHN, CIFAR-10/100, Tiny-ImageNet, and Cityscapes, covering image classification and semantic segmentation across 12 architectures (CNNs and a transformer baseline). Across these settings, Swish-T consistently matches or improves upon widely used activations such as ReLU, GELU, and Swish, while offering a more efficient alternative to SMU. For example, replacing ReLU with Swish-TC in ShuffleNetV2 on CIFAR-100 improves Top-1 accuracy by 4.12%, and replacing ReLU with Swish-T in PRN-50 on Tiny-ImageNet improves accuracy by 0.97%. Compared to SMU, which can incur substantial training-time and memory overhead, Swish-T achieves comparable or better accuracy with lower computational cost, making it a practical activation choice for a broad range of deep learning models.
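Swish itself is $x \cdot \sigma(\beta x)$. One plausible reading of the bounded, zero-centered Tanh-based bias described above is an additive $\alpha \tanh(x)$ term; the exact parameterization of the Swish-T variants (e.g., Swish-TC) may differ from this sketch, so treat the second function as an assumption, not the paper's definition.

```python
import math

def swish(x, beta=1.0):
    # Swish: smooth, non-monotonic, self-gated activation x * sigmoid(beta*x).
    return x / (1.0 + math.exp(-beta * x))

def swish_t(x, beta=1.0, alpha=0.1):
    # Hypothetical Swish-T form: Swish plus a bounded (|bias| <= alpha),
    # zero-centered tanh bias. The published variants may place or scale
    # this term differently.
    return swish(x, beta) + alpha * math.tanh(x)
```

Because tanh is bounded and odd, the bias shifts behavior near the activation threshold while leaving the large-|x| asymptotes of Swish essentially unchanged.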
"Swish-T: Enhancing Swish Activation With Tanh-Based Bias for Improved Neural Network Performance," by Youngmin Seo; Jinha Kim; Unsang Park. IEEE Access, vol. 14, pp. 34404–34419. DOI: 10.1109/ACCESS.2026.3667968. Published 2026-02-25.
Pub Date: 2026-02-25. DOI: 10.1109/ACCESS.2026.3668130
Jorge E. León;Miguel Carrasco;Andrés A. Peters
On one hand, recent advances in chatbots have led to a rising popularity of using these models for coding tasks. On the other hand, modern generative image models rely primarily on text encoders to translate semantic concepts into visual representations, even though there is clear evidence that audio can serve as input as well. Given this, we explore in this work whether state-of-the-art conversational agents can design effective audio encoders to replace the CLIP text encoder of Stable Diffusion 1.5, enabling image synthesis directly from sound. We prompted five publicly available chatbots (namely, ChatGPT o3-mini, Claude 3.7 Sonnet, DeepSeek-R1, Gemini 2.5 Pro Preview 03-25, and Grok 3) to propose neural architectures to serve as these audio encoders, under a set of clearly explained shared conditions. Each valid suggested encoder was trained on over two million context-related audio–image–text observations and evaluated on held-out validation and test sets using various metrics, together with a qualitative analysis of the generated images. Although almost all chatbots produced valid model designs, none achieved satisfactory results, indicating that their audio embeddings failed to align reliably with those of the original text encoder. Among the proposals, the Gemini audio encoder showed the best quantitative metrics, while the Grok audio encoder produced more coherent images (particularly when paired with the text encoder). Our findings reveal a shared architectural bias across chatbots and underscore the coding gap that remains to be bridged in future versions of these models. We also created a public demo so that anyone can study and try out these audio encoders.
Finally, we propose research questions to be tackled in future work and encourage other researchers to design similarly focused, highly specialized tasks, so that chatbots cannot fall back on well-known solutions and their creativity and reasoning are fully put to the test.
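The alignment failure described above is commonly quantified with cosine similarity between a replacement encoder's embedding and the text embedding it should imitate; the abstract only says "various metrics," so this minimal check is an assumption about the evaluation, not a description of it.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors: 1.0 means the
    # audio embedding points in the same direction as the target CLIP
    # text embedding, 0.0 means they are orthogonal (unaligned).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```

Averaged over a held-out set of audio–text pairs, this gives a single scalar alignment score for each candidate encoder.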
"Bad Designs by Good Talkers: Chatbots Failing to Architect Audio Encoders for Image Synthesis," by Jorge E. León; Miguel Carrasco; Andrés A. Peters. IEEE Access, vol. 14, pp. 34436–34457. DOI: 10.1109/ACCESS.2026.3668130. Published 2026-02-25.
Pub Date: 2026-02-25, DOI: 10.1109/ACCESS.2026.3668108
Sen-Tung Wu;Kuan-Yu Hsiao;Nian-Zong Xu
This article proposes a high step-up, series-connected secondary-side resonant converter that integrates a front-end boost stage with a push-pull LLC resonant stage. Benefiting from resonant operation, all switches achieve soft switching, which reduces switching loss and electromagnetic interference (EMI) while improving efficiency and reliability. The inductance ratio K = Lm/Lr is selected by balancing converter size against light-load efficiency. A smaller K widens the gain range and reduces magnetic component size; however, an overly small magnetizing inductance Lm increases the magnetizing current iLm under light load, which may prevent operation in the decoupling region and cause zero-current switching (ZCS) to fail, thereby degrading light-load efficiency. Therefore, Lm is moderately increased to maintain ZCS and improve light-load performance. The system emulates a fuel-cell input of 60–96 V and an 800 V electric vehicle (EV) battery output. A TMS320F28335 digital signal processor (DSP) provides digital control, and the rated power is 1.5 kW. Experimental results show peak efficiencies of 88.58% at low input voltage and 90.98% at high input voltage.
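The K = Lm/Lr trade-off described in the abstract follows from the standard first-harmonic-approximation (FHA) gain of an LLC tank. The expression below is the textbook relation, not taken from the article, with the normalized frequency, resonant frequency, and quality factor defined from the tank elements Lr and Cr and the reflected AC load Rac:

```latex
M(f_n, Q, K) = \frac{1}{\sqrt{\left(1 + \frac{1}{K}\left(1 - \frac{1}{f_n^{2}}\right)\right)^{2} + Q^{2}\left(f_n - \frac{1}{f_n}\right)^{2}}},
\qquad f_n = \frac{f_s}{f_r}, \quad f_r = \frac{1}{2\pi\sqrt{L_r C_r}}, \quad Q = \frac{\sqrt{L_r/C_r}}{R_{ac}}
```

Because 1/K scales the frequency-dependent term, a smaller K = Lm/Lr makes the gain vary more steeply with switching frequency, which is why it widens the attainable gain range, at the cost of the larger magnetizing current the abstract warns about.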
{"title":"Design of a Fuel Cell Input Series-Connected High Step-Up Ratio With Secondary-Side LLC Resonant Converter for 800V EV Charger","authors":"Sen-Tung Wu;Kuan-Yu Hsiao;Nian-Zong Xu","doi":"10.1109/ACCESS.2026.3668108","DOIUrl":"https://doi.org/10.1109/ACCESS.2026.3668108","url":null,"abstract":"This article proposes a high step-up, series-connected secondary-side resonant converter that integrates a front-end boost stage with a push-pull LLC resonant stage. Benefiting from resonant operation, all switches achieve soft switching, which reduces switching loss and electromagnetic interference (EMI) and improves efficiency and reliability. Meanwhile, the inductance ratio K=Lm/Lr is selected by balancing converter size and light-load efficiency. A smaller K widens the gain range and reduces magnetic size; however, an overly small magnetizing inductance Lm increases the magnetizing current iLm under light load, which may prevent operation in the decoupling region and cause ZCS to fail, thereby degrading light-load efficiency. Therefore, Lm is moderately increased to maintain ZCS and improve light-load performance. The system emulates a fuel-cell input of 60–96 V and an 800 V electric vehicle (EV) battery output. A digital signal processor (DSP), TMS320F28335, is used for digital control. The rated power is 1.5 kW. 
Experimental results show peak efficiencies of 88.58% at low input voltage and 90.98% at high input voltage.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"14 ","pages":"34458-34472"},"PeriodicalIF":3.6,"publicationDate":"2026-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11411787","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-23, DOI: 10.1109/ACCESS.2026.3667074
Joung Min Choi;Monjura Afrin Rumi;Connor L. Brown;Peter J. Vikesland;Amy Pruden;Liqing Zhang
The global spread of antibiotic resistance poses a significant threat to human, animal, and plant health. Metagenomic sequencing is increasingly used to profile antibiotic resistance genes (ARGs) in various environments, but a mechanism for predicting future trends in ARG occurrence patterns is presently lacking. The ability to forecast ARG abundance trends would be extremely valuable for informing policy and practice aimed at mitigating the evolution and spread of ARGs. Here we propose ARGfore, a multivariate forecasting model for predicting ARG abundances from time-series metagenomic data. ARGfore extracts features that capture inherent relationships among ARGs and is trained to recognize patterns in ARG trends and seasonality. ARGfore outperformed standard time-series forecasting methods such as N-HiTS, LSTM, and ARIMA, exhibiting the lowest mean absolute percentage error (MAPE) when applied to different wastewater datasets. Additionally, ARGfore demonstrated enhanced computational efficiency, making it a promising candidate for a variety of ARG surveillance applications. Rapid prediction of future trends can facilitate early detection and, if necessary, timely deployment of mitigation efforts. ARGfore is publicly available at https://github.com/joungmin-choi/ARGfore
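As a concrete reference for the headline metric, below is a minimal sketch of the mean absolute percentage error used to rank ARGfore against N-HiTS, LSTM, and ARIMA. The function name and the toy abundance values are illustrative, not taken from the paper:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes all true
    values are nonzero, as relative ARG abundances typically are."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Toy example: observed vs. forecast weekly abundances of a single ARG
observed = [10.0, 12.0, 8.0, 11.0]
predicted = [9.0, 12.6, 8.4, 10.45]
print(round(mape(observed, predicted), 2))  # 6.25 (%)
```

A lower MAPE means forecasts deviate less, in relative terms, from the observed abundances; because it is scale-free, it allows comparison across ARGs whose absolute abundances differ by orders of magnitude.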
{"title":"ARGfore: A Multivariate Framework for Forecasting Antibiotic Resistance Gene Abundances Using Time-Series Metagenomic Datasets","authors":"Joung Min Choi;Monjura Afrin Rumi;Connor L. Brown;Peter J. Vikesland;Amy Pruden;Liqing Zhang","doi":"10.1109/ACCESS.2026.3667074","DOIUrl":"https://doi.org/10.1109/ACCESS.2026.3667074","url":null,"abstract":"The global spread of antibiotic resistance presents a significant threat to human, animal, and plant health. Metagenomic sequencing is increasingly being utilized to profile antibiotic resistance genes (ARGs) in various environments, but presently a mechanism for predicting future trends in ARG occurrence patterns is lacking. Capability of forecasting ARG abundance trends could be extremely valuable towards informing policy and practice aimed at mitigating the evolution and spread of ARGs. Here we propose ARGfore, a multivariate forecasting model for predicting ARG abundances from time-series metagenomic data. ARGfore extracts features that capture inherent relationships among ARGs and is trained to recognize patterns in ARG trends and seasonality. ARGfore outperformed standard time-series forecasting methods, such as N-HiTS, LSTM, and ARIMA, exhibiting the lowest mean absolute percentage error when applied to different wastewater datasets. Additionally, ARGfore demonstrated enhanced computational efficiency, making it a promising candidate for a variety of ARG surveillance applications. The rapid prediction of future trends can facilitate early detection and deployment of mitigation efforts if necessary. 
ARGfore is publicly available at <uri>https://github.com/joungmin-choi/ARGfore</uri>","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"14 ","pages":"34692-34704"},"PeriodicalIF":3.6,"publicationDate":"2026-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11406105","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}