Spiking Neural Networks (SNNs) offer promising solutions for efficient real-time processing of time-series data by closely emulating biological neuronal dynamics. However, existing encoding methods for converting raw input data into spike trains often introduce significant temporal distortions, complexity, or limitations in learnability, hindering their practical deployment. In this study, we propose the Filtered Temporal-Population (FTP) encoding method, a novel technique that integrates filtering operations into SNN encoding. FTP encoding effectively captures both temporal and spatial correlations within data segments while aligning inputs directly with the temporal axis, making it highly suitable for real-time applications. Evaluations on the MIT-BIH electrocardiogram dataset and other time-series datasets demonstrate that FTP encoding outperforms traditional encoding methods in terms of accuracy, speed, and robustness. Our findings highlight FTP encoding’s potential as a practical and effective solution for real-time SNN-based time-series classification tasks.
FTP: Filtered Temporal-Population for time series encoding in Spiking Neural Network. Hyunwon Lee, Won-Seok Hong, Kwon Hong, Hyun-Soo Choi. ICT Express, vol. 11, no. 5, pp. 963-968. Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.07.006.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.07.003
Man Zhou, Xin Che
To improve the security, controllability, and efficiency of ethanol fuel production systems, this paper proposes the Modeling-Virtualization-Rehearsal-Execution (MVRE) feedback control method. This method involves the development of a comprehensive operational process model that integrates technical, procedural, and social interactions to address multi-layered threats. Leveraging virtualization technology, this paper creates a digital twin of the production environment, facilitating performance simulation and the training of unsupervised anomaly detection models. Experimental results show that our approach outperforms baseline methods in terms of precision, recall, F1 score, and training efficiency.
MVRE: A feedback control approach to strengthen ethanol fuel production systems against multi-layered threats. ICT Express, vol. 11, no. 5, pp. 839-845.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.06.001
Zhongyang Li, Yu Zhao, Joohyun Lee
In this work, we model multi-user distributed channel access as a game with $U$ channels and $N$ users, and propose the Multi-Agent Thompson Sampling (MA-TS) algorithm, which uses Bayes’ theorem to dynamically optimize action selection with the aim of maximizing throughput. We derive the algorithm’s computational complexity as $O(T N U N_{\max}^{2})$. Simulations show that MA-TS converges to a pure strategy Nash equilibrium (PNE) and outperforms existing methods in average throughput.
Multi-agent reinforcement learning for a distributed multi-channel access game. ICT Express, vol. 11, no. 5, pp. 863-869.
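To make the idea of Thompson sampling in a channel access game concrete, here is a minimal sketch. It is not the authors' MA-TS algorithm: the collision model (a transmission succeeds only when a user is alone on its channel), the Beta(1, 1) priors, and the function name `ma_thompson_sampling` are all assumptions chosen for illustration.

```python
import random


def ma_thompson_sampling(num_users, num_channels, rounds, seed=0):
    """Toy multi-agent Thompson sampling (illustrative, not the paper's MA-TS)."""
    rng = random.Random(seed)
    # Beta(1, 1) prior per (user, channel): alpha counts successes, beta failures.
    alpha = [[1] * num_channels for _ in range(num_users)]
    beta = [[1] * num_channels for _ in range(num_users)]
    choices = [0] * num_users
    successes = 0

    for _ in range(rounds):
        # Each user samples a success probability per channel from its
        # posterior and picks the channel with the largest sample.
        for u in range(num_users):
            samples = [
                rng.betavariate(alpha[u][c], beta[u][c])
                for c in range(num_channels)
            ]
            choices[u] = max(range(num_channels), key=samples.__getitem__)

        # Assumed collision model: success only if the user is alone.
        load = [0] * num_channels
        for c in choices:
            load[c] += 1

        # Bayesian update of the chosen channel's posterior.
        for u, c in enumerate(choices):
            if load[c] == 1:
                alpha[u][c] += 1
                successes += 1
            else:
                beta[u][c] += 1

    throughput = successes / (rounds * num_users)
    return choices, throughput, alpha, beta


if __name__ == "__main__":
    final_choices, avg_throughput, _, _ = ma_thompson_sampling(3, 3, 500)
    print(final_choices, round(avg_throughput, 3))
```

Each user keeps only its own success and failure counts, so the sketch is fully distributed; whether a run settles on a collision-free assignment depends on the seed and horizon and is not guaranteed by this toy version.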
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.05.008
Kai Chen, Yufei Zhao, Jing Guo, Zhimin Gu, Longxi Han
The Internet of Things (IoT) has transformed industries like smart grids and homes. However, firmware security is a growing concern due to vulnerabilities like command execution and buffer overflows. To address this, we propose ReachDFuzz, a directed fuzzing method using reaching-definition analysis. It targets risky library functions affected by external inputs and integrates static analysis for path pruning. Experiments show that ReachDFuzz outperforms FirmAFL in reducing invalid paths and detecting firmware vulnerabilities.
Accelerating firmware vulnerability detection through directed reaching definition analysis. ICT Express, vol. 11, no. 5, pp. 951-956.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.06.017
Yun Yang
Customer reviews for wireless earbuds were collected and preprocessed using the Playwright and Requests-HTML libraries, ensuring high-quality and relevant data. Sentiments associated with the aspects discussed in these reviews were identified using Recurrent Neural Networks (RNNs) and Bidirectional Encoder Representations from Transformers (BERT) enhanced with attention mechanisms, which helped focus on the most relevant text segments. The models were integrated using ensemble methods, specifically Voting+BERT and Bagging+BERT, to improve accuracy and robustness. The Bagging+BERT model achieved the best performance, with an accuracy of 89.9%, outperforming traditional machine learning models such as Bayesian and logistic regression by 9.6% and 8.7%, respectively.
Sentiment analysis of consumer reviews on online shopping platforms using integrated deep learning models. ICT Express, vol. 11, no. 5, pp. 881-887.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.06.009
Huy Nguyen, Yeong Min Jang
This study proposes HS2PSK-OFDM, a hybrid waveform for vehicular systems using optical camera communication (OCC), which integrates spatial-2-phase-shift-keying (S2-PSK) and rolling-shutter orthogonal frequency division multiplexing (RS-OFDM). The proposed hybrid scheme enables simultaneous transmission of low-rate and high-rate data streams in OCC systems. The low-rate data stream facilitates the detection and tracking of light sources in vehicles, using the You Only Look Once (YOLO) algorithm, to establish OCC links. The high-rate data stream, in contrast, carries the high-rate data and is supported by region-of-interest (RoI) updates derived from the low-rate data stream, which reduces the noise and computational cost of processing high-rate data streams in mobile environments. Deep learning techniques are also proposed to improve the performance of the high-rate stream decoder (the RS-OFDM decoder) in highly mobile environments. This paper analyzes the technical considerations of the deep learning-based HS2PSK-OFDM scheme to validate its performance in mobile environments. In addition, implementation results are presented to evaluate the feasibility of the proposed hybrid scheme.
Experimental demonstration of deep learning-based HS2PSK-OFDM scheme for optical camera communication. ICT Express, vol. 11, no. 5, pp. 894-900.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.05.012
Phu Tran Tin, Minh-Sang Van Nguyen, Tran Trung Duy, Van Huy Pham, Byung-Seo Kim
Non-orthogonal multiple access (NOMA) and reconfigurable intelligent surface (RIS) are critical technologies for future wireless communications that provide spectral efficiency while consuming little power. In this research, we explore the security of a downlink NOMA wireless relay network that incorporates the RIS and Fountain Codes (FCs) technique. To assess system performance and security, we compute closed-form formulas for outage probability (OP) and intercept probability (IP). Furthermore, deep neural networks (DNNs) are used in the system model to evaluate and optimize OP and IP. Monte Carlo simulations are used to validate the theoretical conclusions, yielding the following major insights: (i) The major goal of these simulations is to validate analytical expressions. (ii) This study greatly improves our understanding of RIS-NOMA systems, setting the groundwork for future research into actual implementations. (iii) The results further illustrate the better performance of RIS-NOMA by evaluating important system factors such as the number of reflecting elements, the user threshold rate and the maximum number of encoded packets.
Combination of RIS and Fountain Codes in NOMA relay wireless networks for enhancing system performance and security. ICT Express, vol. 11, no. 5, pp. 909-913.
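The workflow of validating a closed-form outage probability against Monte Carlo simulation can be sketched on a much simpler link than the one studied in the paper. The example below assumes a single Rayleigh-fading link, where the instantaneous SNR is exponentially distributed and the outage probability is $1 - e^{-\gamma_{th}/\bar{\gamma}}$; it does not model RIS, NOMA, relaying, or Fountain Codes, and the function names are hypothetical.

```python
import math
import random


def outage_closed_form(avg_snr, threshold):
    """P(SNR < threshold) for exponentially distributed SNR (Rayleigh fading)."""
    return 1.0 - math.exp(-threshold / avg_snr)


def outage_monte_carlo(avg_snr, threshold, trials=200_000, seed=0):
    """Estimate the same probability by drawing random SNR samples."""
    rng = random.Random(seed)
    # expovariate takes the rate parameter, i.e. 1 / mean.
    outages = sum(
        rng.expovariate(1.0 / avg_snr) < threshold for _ in range(trials)
    )
    return outages / trials


if __name__ == "__main__":
    avg_snr, threshold = 10.0, 2.0  # linear scale, illustrative values
    print("closed form:", outage_closed_form(avg_snr, threshold))
    print("monte carlo:", outage_monte_carlo(avg_snr, threshold))
```

Agreement between the two numbers within sampling error is what "validating the analytical expressions" amounts to; the paper's expressions for OP and IP would replace `outage_closed_form`, and its full system model would replace the single exponential draw.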
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.08.007
Kexun Li, Zhijun Gao
To address the high complexity, poor real-time performance, and frequent false positives and false negatives of current algorithms for UAV-based detection of small-target pollutants on building facades, this study proposes SDS-YOLOv8. The spatial pyramid pooling structure in the backbone is enhanced to improve feature representation. DySample is incorporated into the neck to adaptively adjust sampling points based on the image feature distribution. Additionally, the SCAM module is introduced to improve the retention of important information, and the loss function is further optimized. Experimental results show that the proposed algorithm is significantly more accurate and generalizes well.
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.07.007
MinGi Kim, DongHyun Shin, WooHyung Ko, YoungBeom Kim, Seog Chung Seo
In this paper, we present an optimized implementation of SMAUG-T, one of the Round 2 Key Encapsulation Mechanism algorithms in the Korean Post-quantum Cryptography Competition, on a widely used 16-bit MSP430 MCU. To make polynomial multiplication, one of the most time-consuming operations in SMAUG-T, efficient, we identify the best method by evaluating several recent algorithms, such as the Toom–Cook method and Number-Theoretic Transform (NTT)-based methods (a 32-bit single-modulus version and a 16-bit multi-moduli version). We found that the 32-bit single-modulus version is the best approach for polynomial multiplication in SMAUG-T on the 16-bit MSP430 MCU. To enhance the performance of NTT-based polynomial multiplication, we propose an improved 32-bit signed Montgomery multiplication method that uses a newly found Montgomery prime (0x250001) and the intrinsic hardware multiplier. We also apply state-of-the-art techniques for the NTT and inverse NTT (iNTT), such as layer merging and the CT butterfly, tuned to the target device. As a result, our NTT implementation improves performance by around 35% compared to the previous best 32-bit single-modulus implementation, which was proposed for Dilithium on the 16-bit MSP430 MCU. Finally, our SMAUG-T implementation with the proposed NTT improves performance by 43%–63%, 92%–99%, and 85%–95% for key generation, encapsulation, and decapsulation, respectively, compared to the reference implementation.
Optimized implementation of SMAUG-T on resource-constrained 16-bit MSP430 MCU. ICT Express, vol. 11, no. 5, pp. 851-857.
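For readers unfamiliar with Montgomery multiplication, the following sketch shows the textbook unsigned reduction with $R = 2^{32}$ and the modulus 0x250001 quoted in the abstract. It is not the authors' signed, MSP430-tuned method; the helper names (`mont_reduce`, `to_mont`, `mont_mul`, `from_mont`) are hypothetical, and Python's arbitrary-precision integers stand in for the 16-bit hardware multiplier.

```python
Q = 0x250001                       # modulus quoted in the abstract
R_BITS = 32
R = 1 << R_BITS                    # Montgomery radix, R = 2^32
Q_NEG_INV = (-pow(Q, -1, R)) % R   # -Q^{-1} mod R (requires Python 3.8+)


def mont_reduce(t):
    """Return t * R^{-1} mod Q for 0 <= t < Q * R, without dividing by Q."""
    m = (t * Q_NEG_INV) & (R - 1)  # m = -t * Q^{-1} mod R
    u = (t + m * Q) >> R_BITS      # t + m*Q is divisible by R, so this is exact
    return u - Q if u >= Q else u  # u < 2Q, one conditional subtraction suffices


def to_mont(a):
    """Map a to its Montgomery form a * R mod Q."""
    return (a * R) % Q


def mont_mul(a_m, b_m):
    """Multiply two values in Montgomery form; the result stays in that form."""
    return mont_reduce(a_m * b_m)


def from_mont(a_m):
    """Map a Montgomery-form value back to the ordinary residue."""
    return mont_reduce(a_m)


if __name__ == "__main__":
    a, b = 123456, 2000000
    product = from_mont(mont_mul(to_mont(a), to_mont(b)))
    print(product == (a * b) % Q)
```

The point of the technique is that reduction uses only multiplications, a mask, and a shift. Note also that Q - 1 = 37 · 2^16, which is the kind of structure that makes a modulus convenient for power-of-two NTT lengths.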
Pub Date: 2025-10-01; DOI: 10.1016/j.icte.2025.08.001
Md Ilias Bappi, Jannat Afrin Juthy, Kyungbaek Kim
Diabetic Retinopathy (DR) is a leading cause of vision impairment and blindness worldwide. Early diagnosis is crucial for preventing irreversible vision loss, but manual screening methods are time-consuming and often inconsistent. Deep learning (DL) techniques have shown promise in automating DR detection; however, many existing models still struggle to capture subtle lesions and distinguish fine-grained severity stages. In this survey, we comprehensively review recent DL-based approaches for DR classification, emphasizing attention mechanisms, feature fusion strategies, and stage-wise grading. To address current gaps, we propose a hybrid taxonomy that identifies effective combinations such as texture-based attention, CNN-Transformer fusion, and multi-modal integration. Additionally, we validate our previously published model, STMFNet, a spatial texture-aware attention network based on EfficientNet, across four benchmark datasets. On EyePACS and Messidor, STMFNet achieves up to 98.10% accuracy, outperforming several state-of-the-art (SOTA) models under similar settings. This study provides both a consolidated overview of DR detection advancements and a practical benchmark framework to guide future research in AI-assisted DR classification.
Deep learning-based diabetic retinopathy recognition and grading: Challenges, gaps, and an improved approach — A survey. ICT Express, vol. 11, no. 5, pp. 993-1013.