Pub Date: 2026-02-23 | DOI: 10.1109/ACCESS.2026.3665831
Joseph Thomas Año;Arren Matthew C. Antioquia
Skin disease classification presents significant challenges due to class imbalance, low inter-class variability, and high intra-class variation present in most clinical image datasets. To address these issues, we propose Fair Channel Enhancement (FCE), a novel module that improves fine-grained feature representation without requiring additional annotations or complex architectures. FCE allocates feature channels proportionally based on class frequency, which ensures fairer representation of underrepresented classes. FCE is coupled with CutMix augmentation and label smoothing to enhance model robustness and generalization. Extensive experiments on three dermatology benchmark datasets, namely SD-128, SD-198, and SD-260, demonstrate that our approach achieves up to a 7.13% accuracy improvement over baseline models and outperforms state-of-the-art methods by a significant margin. FCE also boosts the average accuracy of both low- and high-frequency classes by up to 8.60% and 10.48%, respectively. Furthermore, our method generalizes effectively to other medical image datasets, including ISIC 2018 and Hyper-Kvasir, and performs well on smaller dataset subsets. These results highlight FCE as a simple and effective solution for imbalanced classification problems.
"Channeling Fairness: Class Imbalance-Aware Skin Disease Recognition via Fair Channel Enhancement Module," IEEE Access, vol. 14, pp. 29674–29691.
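The abstract does not specify FCE's allocation rule in detail; one plausible reading, sketched below in plain Python, assigns the available channel budget by inverse class frequency so that rarer classes receive a larger share. The function name, the largest-remainder rounding scheme, and the class counts are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of frequency-based channel allocation (not the
# authors' implementation): rarer classes receive proportionally more
# of the available feature channels via inverse-frequency weights.
def allocate_channels(class_counts, total_channels):
    inv = {c: 1.0 / n for c, n in class_counts.items()}
    norm = sum(inv.values())
    raw = {c: total_channels * w / norm for c, w in inv.items()}
    alloc = {c: int(r) for c, r in raw.items()}  # floor each share
    # Largest-remainder rounding so the allocation sums to total_channels.
    remainder = total_channels - sum(alloc.values())
    for c in sorted(raw, key=lambda c: raw[c] - alloc[c], reverse=True)[:remainder]:
        alloc[c] += 1
    return alloc

# Invented class counts: the rare class ends up with most channels.
print(allocate_channels({"acne": 900, "melanoma": 90, "vitiligo": 10}, 64))
```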
Pub Date: 2026-02-23 | DOI: 10.1109/ACCESS.2026.3665093
Muhammad Saeed Javed;Ali Hennache;Muhammad Imran;Saddam Hussain Abbasi
Healthcare analytics involving sensitive patient data demands robust statistical safeguards alongside verifiable compliance mechanisms that allow regulators to independently reconstruct query operations, applied policies, and reported outcomes without accessing raw medical information. We present a governance-first, upgradeable on-chain framework that anchors complete request lifecycles to a public blockchain using keccak256 cryptographic commitments. This approach delineates a process wherein researchers submit structured queries, a controller captures active policy states, and hospitals provide result digests. A comprehensive and immutable event trail for this entire process is maintained on the Ethereum Sepolia blockchain. External auditors can then reconstruct complete timelines and verify consistent binding between requests, policies, and outcomes. The proposed modular verifier interface, currently implemented via a configurable MockVerifier, maintains stable Application Binary Interface compatibility for future production verifier integration while validating end-to-end governance. We demonstrate the framework’s practicality through a detailed diabetes prevalence analytics case study. We provide detailed gas consumption profiles, request latency measurements, and observable failure modes that demonstrate how governance rules translate into enforceable reversions. The architecture maintains minimal on-chain state with Layer-2 readiness, offering immediate regulatory accountability while preserving straightforward upgrade paths to advanced cryptographic components like zk verifiers (Groth16), verifiable differential privacy through VRF-seeded randomness, content-addressed artifact storage, and versioned policy management. This approach effectively separates initial deployment feasibility from computationally intensive cryptography while delivering immediately actionable, externally verifiable evidence of policy compliance.
"Policy-Bound, Verifier-Pluggable Smart Contract Framework for Auditable Healthcare Analytics," IEEE Access, vol. 14, pp. 29590–29609.
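The commitment-anchoring idea above can be illustrated with a small, hypothetical hash chain over the request lifecycle. Note that `hashlib.sha3_256` stands in for Ethereum's keccak256 (the two differ only in padding); a real deployment would use an actual keccak library such as `eth-hash`, and the record fields below are invented, not the paper's contract schema.

```python
import hashlib
import json

# Illustrative commitment chain (hypothetical, not the paper's contract):
# each lifecycle step is committed as H(prev_commit || canonical_json).
# hashlib.sha3_256 stands in for Ethereum's keccak256, which uses
# different padding; use a real keccak library in practice.
def commit(prev: bytes, record: dict) -> bytes:
    payload = json.dumps(record, sort_keys=True).encode()  # canonical form
    return hashlib.sha3_256(prev + payload).digest()

c0 = commit(b"\x00" * 32, {"step": "request", "query": "diabetes prevalence"})
c1 = commit(c0, {"step": "policy", "k_anonymity": 5})
c2 = commit(c1, {"step": "result", "digest": "ab12..."})
# An auditor holding the same off-chain records can recompute c2 and
# compare it with the on-chain anchor to verify the full timeline.
```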
Event Based Surveillance (EBS) monitors online sources such as broadcast, print, and web news, and generates early warning and response (EWAR) signals for use in disaster mitigation. These online sources provide a dynamic data stream, allowing for potential real-time EBS updates. However, news articles fragment information across varied sources, and redundant information is known to overburden EBS. In this study, we propose a Large Language Model (LLM)-based approach that filters out redundancies while learning novel information from event-centric online news corpora. We study this novelty task for events covering the animal health, food security, and climate change surveillance domains. Our approach focuses on features integrating spatio-temporal information (such as the location and date of an event) and thematic information (such as the name of a disease, food-insecurity triggers, or climate change magnitude). We characterize novelty as the presence of new and additional information (e.g., a newly mentioned disease name or additional location information), as distinguished from duplicate (e.g., an already seen disease name) and missing (expected but absent) information. To this end, our approach proposes a fine-grained classification of novelty in event surveillance and adopts language modeling with a multi-class classification objective to learn to classify event information. Our LLM adoption strategy uses question-based prompts whose extracted answers map to predefined feature types (e.g., location, date, disease name) in order to enrich our classifier. In our empirical studies, we present a comparative analysis of language models and large language models, achieving state-of-the-art performance in the event novelty classification task. Our findings demonstrate the ability of cross-domain novelty classification, with our model EpidGPT (few-shot) achieving F1 scores (%) of 82.3, 85.49, and 88.97 in the animal health, food security, and climate change domains, while fine-tuned EpidGPT achieves F1 scores of 96.02, 86.0, and 88.45 on the respective domains.
Edmond Menya; Roberto Interdonato; Dickson Owuor; Mathieu Roche, "Novelty Detection in Event Surveillance Documents," IEEE Access, vol. 14, pp. 29566–29589, 2026-02-23, doi: 10.1109/ACCESS.2026.3666022.
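The new/duplicate/missing distinction described above can be sketched as a feature-level labeling function. The feature names, the seen-event store, and the example values are illustrative assumptions, not the paper's extraction pipeline.

```python
# Hypothetical sketch of the fine-grained novelty labels: each expected
# feature type extracted from a new article is labeled "new",
# "duplicate", or "missing" against the features already seen for that
# event. All names and values here are illustrative only.
EXPECTED = ("location", "date", "disease")

def label_features(seen: dict, extracted: dict) -> dict:
    labels = {}
    for feat in EXPECTED:
        values = extracted.get(feat)
        if not values:
            labels[feat] = "missing"    # expected but absent
        elif set(values) - set(seen.get(feat, [])):
            labels[feat] = "new"        # carries unseen information
        else:
            labels[feat] = "duplicate"  # already covered by earlier articles
    return labels

seen = {"location": ["Nakuru"], "disease": ["Rift Valley fever"]}
extracted = {"location": ["Nakuru", "Naivasha"], "disease": ["Rift Valley fever"]}
# Location gains Naivasha -> "new"; disease repeats -> "duplicate";
# no date was extracted -> "missing".
print(label_features(seen, extracted))
```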
The surge in access to explicit content across various platforms has sparked major concerns, yet existing content filtering systems find it difficult to analyze different media formats, leading to the unchecked dissemination of harmful content. To tackle these shortcomings, the authors propose SHIELD, an optimized end-to-end pipeline to detect and analyze explicit content using a large-language-model (LLM)-driven approach. SHIELD processes multimedia inputs by segregating and preprocessing them, converting all formats into text through advanced models, extracting meaningful textual context, and subjecting the resulting data to two parallel evaluation mechanisms: an LLM-based classifier for contextual analysis, and a semantic vector-based scoring system for quantitative measurement. Explicitness classifications are output in a JSON format, which allows easy integration into real-world systems. When benchmarked against a manually curated ground truth dataset, the LLM-based system surpasses the vector-based approach, with an accuracy of 93.32% as against 67.81%. The pipeline shows robustness across all media types and file sizes, confirming its viability as a scalable, context-aware solution.
Dishant Kapoor; Karan Ahuja; Deepika Kumar; Paanav Puri; Srinivas Jangirala; Vedika Gupta; Anandadeep Mandal, "SHIELD: System for Harmful Explicit-Content Identification and Evaluation Through LLM-Driven Approach," IEEE Access, vol. 14, pp. 29493–29522, 2026-02-23, doi: 10.1109/ACCESS.2026.3667099.
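The semantic vector-based scoring branch can be illustrated with cosine similarity against an "explicit" reference centroid, emitted in the JSON format the abstract mentions. The toy embeddings and the 0.75 threshold are invented assumptions; a real system would embed content with a sentence-encoder model.

```python
import json
import math

# Hypothetical sketch of the vector-scoring branch: cosine similarity
# between a content embedding and an "explicit" reference centroid,
# packaged as a JSON classification. Vectors and threshold are toys.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def score(embedding, explicit_centroid, threshold=0.75):
    s = cosine(embedding, explicit_centroid)
    return json.dumps({"explicit": s >= threshold, "score": round(s, 3)})

print(score([0.9, 0.1, 0.2], [1.0, 0.0, 0.1]))
```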
Pub Date: 2026-02-23 | DOI: 10.1109/ACCESS.2026.3664958
Sandesh Bhaktha B;P. N. Sarath Kannan;Jeyaraj Pitchaimani;K. V. Gangadharan
This paper presents a novel methodology for realizing a computationally efficient multi-objective design optimization (CE-MODO) of switched reluctance motors (SRMs). Existing MODO approaches rely heavily on static and dynamic finite element analysis (FEA) to evaluate the electromagnetic performance parameters, which, despite their accuracy, incur significant computational costs. While analytical models reduce the dependence on static FEA, their limited accuracy constrains their applicability. Similarly, existing machine learning methods are inadequate for mapping multiple geometric parameters (GPs) to static characteristics across varied SRM designs. To overcome these limitations, this study proposes a novel integrated approach that combines K-means clustering with an artificial neural network (ANN) to enable accurate and efficient prediction of static characteristics. This technique led to a 52.16% decrease in the total computation time for determining the static characteristics of the SRM designs. Further, dynamic performance is evaluated using a MATLAB/Simulink-based SRM drive model, offering a computationally lightweight alternative to dynamic FEA. The proposed CE-MODO framework is applied to a four-phase 8/6 SRM topology designed for an electric three-wheeler, with average torque and electromagnetic losses as optimization objectives. Optimization is carried out by coupling the nondominated sorting genetic algorithm II (NSGA-II) with Kriging surrogate models, significantly reducing the computational load. The proposed methodology achieved an 11.01% improvement in average torque and a 4.56% reduction in electromagnetic losses compared to the initial design. The FEA models corresponding to both static and dynamic analyses employed in this study are further validated through experimental testing on a fabricated SRM prototype.
"A Computationally Efficient Multi-Objective Design Optimization of SRM Using K-Means Clustering and Artificial Neural Networks," IEEE Access, vol. 14, pp. 29729–29747.
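The clustering stage of the SRM pipeline above can be sketched with a toy k-means over geometric-parameter vectors. The per-cluster ANN regression is omitted, and the farthest-point initialization, parameter names, and numbers are illustrative assumptions rather than the paper's setup.

```python
# Toy k-means over geometric-parameter vectors (hypothetical sketch of
# the clustering stage only; in the paper each cluster feeds an ANN
# that predicts static characteristics, which is not shown here).
def kmeans(points, k, iters=20):
    def d2(p, c):  # squared Euclidean distance
        return sum((a - b) ** 2 for a, b in zip(p, c))
    centers = [points[0]]
    while len(centers) < k:  # deterministic farthest-point initialization
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: d2(p, centers[i]))].append(p)
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups

# Invented geometric parameters, e.g. (stator pole arc, rotor pole arc).
designs = [(20.0, 22.0), (20.5, 21.5), (24.0, 26.0), (24.5, 25.5)]
centers, groups = kmeans(designs, k=2)
print(centers)
```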
In the context of increasing climate variability and the gradual decline of ground-based observation networks, satellite-based rainfall estimates (SREs) have become indispensable tools for hydrological monitoring, disaster preparedness, and climate modeling. Satellite technology has evolved rapidly in recent years, with new missions, sensors, and techniques implemented by agencies and researchers to improve SRE products. This study presents a global systematic review explicitly applying the PRISMA methodology, offering a structured and reproducible framework for evidence synthesis. It evaluates the performance of the most widely used SREs across all continents from January 2018 to November 2025, with particular emphasis on their application in data-scarce and hydrologically complex regions. Drawing from 636 peer-reviewed studies, the review identifies key factors affecting the accuracy of SREs, including topography, rainfall type, and seasonality. Notably, products that integrate satellite data with ground-based observations consistently demonstrate superior performance compared to satellite-only estimates. Among them, IMERG-Final and CHIRPS stand out as the most widely used datasets worldwide, with IMERG-Final showing particularly promising performance across most continents. The findings highlight the need for future research to prioritize the development of advanced bias correction algorithms, region-specific calibration methods, and hybrid models that incorporate additional meteorological variables. Although previous reviews have addressed this topic, the present synthesis offers an updated and concise reference for selecting suitable SREs across diverse environmental and operational contexts.
Luiza Chiarelli Conte; Rutineia Tassi; Débora Missio Bayer, "Satellite-Based Rainfall Datasets: A Global Systematic Review of Applications, Accuracy, and Research Gaps," IEEE Access, vol. 14, pp. 29539–29565, 2026-02-23, doi: 10.1109/ACCESS.2026.3667060.
Pub Date: 2026-02-23 | DOI: 10.1109/ACCESS.2026.3665976
Filipe O. F. Arsénio;António Raimundo;João Pedro C. B. B. Pavia
The telecommunications industry is characterized by intense competition and rapid technological evolution, making financial stability a critical factor for sustained growth. This work focuses on leveraging machine learning techniques to analyze and predict customer payment behavior within a Portuguese telecommunications company, aiming to reduce financial losses associated with unpaid debts. Using the CRISP-DM methodology, the project first develops supervised learning models to predict whether customers will remain good payers, based solely on internal data. Among the algorithms tested, Random Forest achieved the highest accuracy of 99%, enabling early identification of potential defaulters. Complementing this, unsupervised learning methods, specifically Principal Component Analysis for dimensionality reduction and K-Means clustering, uncover hidden behavioral segments within the customer base. The optimal clustering identified five distinct groups, some of which show near-homogeneous target values (close to 0 or 1), allowing for strong characterization of compliant and non-compliant profiles. The findings demonstrate the effectiveness of combining supervised and unsupervised learning for risk analysis. Supervised models allow scenario testing by altering feature values to simulate changes in payment behavior. In unsupervised learning, analyzing ambiguous clusters through comparison with more definitive ones helps estimate likely client outcomes and supports proactive management. Future work may explore focused clustering of non-compliant clients, alternative data preprocessing, and time series forecasting to further improve predictive accuracy and operational utility.
"Anticipating Financial Risk: Machine Learning for Debt Management in Telecommunications," IEEE Access, vol. 14, pp. 29523–29538.
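The scenario-testing idea above, altering feature values and observing the predicted payment behavior, can be sketched with a stand-in scorer. The feature names and weights below are invented and merely substitute for the trained Random Forest.

```python
# Hypothetical scenario test: alter one feature of a customer record and
# compare the default-risk prediction before and after. The linear
# scorer is a stand-in for the trained model; features/weights invented.
WEIGHTS = {"late_payments": 0.30, "months_active": -0.01, "open_invoices": 0.10}

def default_risk(customer: dict) -> float:
    score = sum(WEIGHTS[f] * customer.get(f, 0) for f in WEIGHTS)
    return max(0.0, min(1.0, score))  # clamp to [0, 1]

def scenario(customer: dict, feature: str, new_value):
    altered = {**customer, feature: new_value}
    return default_risk(customer), default_risk(altered)

# What-if: the customer clears their late payments.
before, after = scenario(
    {"late_payments": 2, "months_active": 24, "open_invoices": 1},
    "late_payments", 0,
)
print(before, after)
```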
Pub Date: 2026-02-23 | DOI: 10.1109/ACCESS.2026.3667399
Kai Zhang;Qiuxia Zhang;Chung-Che Wang;Jyh-Shing Roger Jang
This study addresses the challenge in Chinese automatic speech recognition (ASR) systems of accurately recognizing proper nouns such as place names, personal names, song titles, and movie or TV show titles, which is often hindered by tonal features and abundant homophones. We propose an integrated contextual biasing framework centered on multimodal large language models (MLLMs) to enhance the system’s context awareness and task adaptability. The core of this framework is an intent-driven dynamic contextual biasing mechanism: first, a fine-tuned MLLM performs end-to-end intent recognition, achieving an 81.82% relative error rate reduction compared to the unfine-tuned model and a 66.71% reduction relative to a cascaded model; subsequently, based on the highly accurate intent predictions, context-relevant keyword prompts are dynamically generated to guide speech recognition. Models fine-tuned using this strategy demonstrate significant improvements in both character error rate (CER) and keyword error rate (KER), with a 41.48% relative error reduction in KER. To address the cold-start problem, we also develop an automated data generation pipeline that requires only a domain-specific list of proper nouns to generate natural sentences using a small language model, followed by speech synthesis to produce training audio. Experiments show that models fine-tuned with synthetic data achieve a 41.91% relative error reduction in keyword recognition, nearly matching the performance of models trained on real annotated data. Overall, this work provides an innovative framework for contextual biasing in Chinese ASR and demonstrates, through open-source code and evaluation standards, the potential of multimodal large language models in integrating speech understanding and recognition tasks.
{"title":"Improving Contextual Biasing in Chinese ASR via Multimodal Large Language Models","authors":"Kai Zhang;Qiuxia Zhang;Chung-Che Wang;Jyh-Shing Roger Jang","doi":"10.1109/ACCESS.2026.3667399","DOIUrl":"https://doi.org/10.1109/ACCESS.2026.3667399","url":null,"abstract":"This study addresses the challenge in Chinese automatic speech recognition (ASR) systems of accurately recognizing proper nouns such as place names, personal names, song titles, and movie or TV show titles, which is often hindered by tonal features and abundant homophones. We propose an integrated contextual biasing framework centered on multimodal large language models (MLLMs) to enhance the system’s context awareness and task adaptability. The core of this framework is an intent-driven dynamic contextual biasing mechanism: first, a fine-tuned MLLM performs end-to-end intent recognition, achieving an 81.82% relative error rate reduction compared to the unfine-tuned model and a 66.71% reduction relative to a cascaded model; subsequently, based on the highly accurate intent predictions, context-relevant keyword prompts are dynamically generated to guide speech recognition. Models fine-tuned using this strategy demonstrate significant improvements in both character error rate (CER) and keyword error rate (KER), with a 41.48% relative error reduction in KER. To address the cold-start problem, we also develop an automated data generation pipeline that requires only a domain-specific list of proper nouns to generate natural sentences using a small language model, followed by speech synthesis to produce training audio. Experiments show that models fine-tuned with synthetic data achieve a 41.91% relative error reduction in keyword recognition, nearly matching the performance of models trained on real annotated data. 
Overall, this work provides an innovative framework for contextual biasing in Chinese ASR and demonstrates, through open-source code and evaluation standards, the potential of multimodal large language models in integrating speech understanding and recognition tasks.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"14 ","pages":"29706-29728"},"PeriodicalIF":3.6,"publicationDate":"2026-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11408189","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
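The intent-driven biasing mechanism described above can be illustrated with a small sketch. This is not the authors' code: the intent labels, keyword lists, and prompt format below are all hypothetical stand-ins for the paper's fine-tuned MLLM pipeline, showing only the general idea of mapping a predicted intent to a domain keyword list and formatting it as a context prompt for a prompt-aware ASR decoder.

```python
# Illustrative sketch of intent-conditioned keyword prompting for
# contextual biasing. All names and lists are hypothetical examples.
KEYWORD_LISTS = {
    "play_music": ["青花瓷", "周杰伦", "夜曲"],
    "navigate": ["中山北路", "台北车站"],
}

def build_bias_prompt(intent: str, max_keywords: int = 50) -> str:
    """Return a keyword prompt for the predicted intent, or an empty
    prompt when the intent has no associated proper-noun list."""
    keywords = KEYWORD_LISTS.get(intent, [])[:max_keywords]
    if not keywords:
        return ""
    return "Relevant terms: " + ", ".join(keywords)
```

In the paper's framework the prompt would be consumed by the speech-recognition model rather than printed; the cap on keyword count reflects the practical need to keep the biasing list short.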
Pub Date : 2026-02-20DOI: 10.1109/ACCESS.2026.3666584
Vinay Santhosh Chitla;Hemantha Kumar Kalluri;Satya Krishna Nunna;Mahesh Kumar Morampudi
Graph Neural Networks have emerged as powerful tools for analyzing graph-structured data. However, their performance often varies across datasets due to challenges such as noisy edges, sparse connectivity, and over-smoothing in deep layers. To address these limitations, Graph Deformation Network Convolution (GDNConv) is proposed as a novel graph convolution model that incorporates four key innovations: dynamic edge weight learning to filter noisy connections, graph attention deformation to prioritize relevant neighbors, multi-level aggregation to capture multi-scale patterns, and self-regularization to stabilize training. This proposed model demonstrates robustness and scalability, particularly for real-world applications involving complex and noisy graph structures, such as social networks and recommendation systems. Its ability to dynamically adapt graph topology during training and its superior performance on both dense and sparse datasets highlight its potential as a versatile solution for graph-based learning tasks. Additionally, GDNConv’s computational efficiency and self-regularization mechanisms make it suitable for large-scale applications where resource constraints are a concern. The proposed model is evaluated on four benchmark datasets—Cora, CiteSeer, PubMed, and ogbn-arxiv—and compared with several state-of-the-art models, including Graph Convolutional Network, Graph Attention Network, and Graph Sample and Aggregate.
{"title":"GDNConv: A Novel Graph Deformation Network for Robust Representation Learning on Noisy Graph Structures","authors":"Vinay Santhosh Chitla;Hemantha Kumar Kalluri;Satya Krishna Nunna;Mahesh Kumar Morampudi","doi":"10.1109/ACCESS.2026.3666584","DOIUrl":"https://doi.org/10.1109/ACCESS.2026.3666584","url":null,"abstract":"Graph Neural Networks have emerged as powerful tools for analyzing graph-structured data. However, their performance often varies across datasets due to challenges such as noisy edges, sparse connectivity, and over-smoothing in deep layers. To address these limitations, Graph Deformation Network Convolution (GDNConv) is proposed as a novel graph convolution model that incorporates four key innovations: dynamic edge weight learning to filter noisy connections, graph attention deformation to prioritize relevant neighbors, multi-level aggregation to capture multi-scale patterns, and self-regularization to stabilize training. This proposed model demonstrates robustness and scalability, particularly for real-world applications involving complex and noisy graph structures, such as social networks and recommendation systems. Its ability to dynamically adapt graph topology during training and its superior performance on both dense and sparse datasets highlight its potential as a versatile solution for graph-based learning tasks. Additionally, GDNConv’s computational efficiency and self-regularization mechanisms make it suitable for large-scale applications where resource constraints are a concern. The proposed model is evaluated on four benchmark datasets—Cora, CiteSeer, PubMed, and ogbn-arxiv—and compared with several state-of-the-art models, including Graph Convolutional Network, Graph Attention Network, and Graph Sample and Aggregate. 
The experimental results demonstrate that the proposed model consistently outperforms these baseline approaches, achieving improvements of 4.7% in accuracy and 4.2% in F1 score.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"14 ","pages":"29610-29627"},"PeriodicalIF":3.6,"publicationDate":"2026-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11404168","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147292791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
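The "dynamic edge weight learning" innovation can be sketched in a few lines. This is an assumed, minimal NumPy illustration rather than the authors' implementation: scoring each edge with a learnable bilinear form and squashing it through a sigmoid is one common way to give every edge a data-dependent weight in (0, 1) so that noisy connections are down-weighted before neighbour aggregation.

```python
# Minimal sketch (assumed, not the GDNConv implementation) of dynamic
# edge weighting: each directed edge gets a gate in (0, 1) derived from
# a learnable bilinear score, then neighbours are aggregated with a
# gate-normalised weighted mean.
import numpy as np

def dynamic_edge_aggregate(x: np.ndarray, edges: list, w: np.ndarray) -> np.ndarray:
    """x: (N, F) node features; edges: directed (src, dst) pairs;
    w: (F, F) learnable scoring matrix. Returns (N, F) aggregates."""
    out = np.zeros_like(x, dtype=float)
    norm = np.zeros(x.shape[0])
    for src, dst in edges:
        score = x[dst] @ w @ x[src]           # edge relevance score
        gate = 1.0 / (1.0 + np.exp(-score))   # sigmoid edge weight
        out[dst] += gate * x[src]
        norm[dst] += gate
    norm[norm == 0] = 1.0                     # isolated nodes keep zeros
    return out / norm[:, None]
```

A trained model would learn `w` by backpropagation; near-zero gates effectively prune edges, which is the noise-filtering behaviour the abstract describes.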
Pub Date : 2026-02-20DOI: 10.1109/ACCESS.2026.3666623
Eman Alattas;John Clark;Bassma Alsulami;Salma Kammoun Jarraya
Deepfakes pose a growing risk to digital integrity and public trust, driving the need for robust video-level forgery-detection methods. Many existing approaches analyse individual frames independently and overlook temporal dependencies, thereby weakening generalisation to unseen manipulation techniques. This paper introduces 3D-CoAtNet, a spatiotemporal architecture for deepfake video detection that processes multiple frames simultaneously, thereby reducing reliance on single-frame artefacts. The model inflates CoAtNet’s 2D convolutional, residual, pooling, and self-attention layers into their 3D counterparts to learn spatial and temporal representations from multiple frames. We evaluated two input modalities: RGB 15-frame clips sampled from each video, and 15-frame optical-flow sequences that capture motion cues. Extensive experiments on FaceForensics++ (FF++), DFDC, and Celeb-DF under intra- and cross-dataset settings show that 3D-CoAtNet is competitive in intra-dataset evaluations (best on the DeepFakes subset) and transfers well to Celeb-DF. Moreover, although frame-based CoAtNet16A achieves strong within-dataset accuracy, 3D-CoAtNet improves cross-dataset generalisation.
{"title":"Beyond Frames: 3D-CoAtNet for Generalizable Deepfake Video Detection","authors":"Eman Alattas;John Clark;Bassma Alsulami;Salma Kammoun Jarraya","doi":"10.1109/ACCESS.2026.3666623","DOIUrl":"https://doi.org/10.1109/ACCESS.2026.3666623","url":null,"abstract":"Deepfakes pose a growing risk to digital integrity and public trust, driving the need for robust video-level forgery-detection methods. Many existing approaches analyse individual frames independently and overlook temporal dependencies, thereby weakening the generalisation to unseen manipulation techniques. This paper introduces 3D-CoAtNet, a spatiotemporal architecture for deepfake video detection that processes multiple frames simultaneously, thereby reducing reliance on single-frame artefacts. The model inflates CoAtNet’s 2D convolutional, residual, pooling, and self-attention layers into their 3D counterparts to learn spatial and temporal representations from multiple frames. We evaluated two input modalities: RGB 15-frame clips sampled from each video, and 15-frame optical-flow sequences that capture motion cues. Extensive experiments on FaceForensics++ (FF++), DFDC, and Celeb-DF under intra- and cross-dataset settings show that 3D-CoAtNet is competitive in intra-dataset evaluations (best in the DeepFakes dataset) and transfers well to Celeb-DF. Moreover, although frame-based CoAtNet16A achieves strong within-dataset accuracy, 3D-CoAtNet improves cross-dataset generalisation. 
These findings highlight the importance of the proposed 3D-CoAtNet model for deepfake forensics.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"14 ","pages":"29692-29705"},"PeriodicalIF":3.6,"publicationDate":"2026-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11404125","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
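The layer-inflation step described in the abstract follows a well-known pattern (popularised by I3D): a pretrained 2D kernel is tiled along a new temporal axis and rescaled so the 3D filter's response to a temporally constant input matches the original 2D response. The sketch below shows that general technique on a raw weight array; the specific handling of CoAtNet's residual, pooling, and attention layers is not reproduced here.

```python
# Sketch of 2D-to-3D kernel inflation (the general technique, not the
# paper's exact CoAtNet code): repeat the spatial kernel t times along a
# new temporal axis and divide by t to preserve the response magnitude
# on temporally constant inputs.
import numpy as np

def inflate_kernel(w2d: np.ndarray, t: int) -> np.ndarray:
    """w2d: (C_out, C_in, k, k) -> (C_out, C_in, t, k, k), rescaled by 1/t."""
    w3d = np.repeat(w2d[:, :, None, :, :], t, axis=2)
    return w3d / t
```

Summing the inflated kernel over its temporal axis recovers the original 2D weights, which is exactly the property that lets 2D-pretrained weights bootstrap a 3D network.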