
Frontiers in Big Data: Latest Publications

Parameter-efficient fine-tuning for low-resource text classification: a comparative study of LoRA, IA3, and ReFT.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-12-02 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1677331
Steve Nwaiwu

The successful application of large-scale transformer models in Natural Language Processing (NLP) is often hindered by the substantial computational cost and data requirements of full fine-tuning. This challenge is particularly acute in low-resource settings, where standard fine-tuning can lead to catastrophic overfitting and model collapse. To address this, Parameter-Efficient Fine-Tuning (PEFT) methods have emerged as a promising solution. However, a direct comparative analysis of their trade-offs under unified low-resource conditions is lacking. This study provides a rigorous empirical evaluation of three prominent PEFT methods: Low-Rank Adaptation (LoRA), Infused Adapter by Inhibiting and Amplifying Inner Activations (IA3), and a Representation Fine-Tuning (ReFT) strategy. Using a DistilBERT base model on low-resource versions of the AG News and Amazon Reviews datasets, the present work compares these methods against a full fine-tuning baseline across accuracy, F1 score, trainable parameters, and GPU memory usage. The findings reveal that while all PEFT methods dramatically outperform the baseline, LoRA consistently achieves the highest F1 scores (0.909 on Amazon Reviews). Critically, ReFT delivers nearly identical performance (~98% of LoRA's F1 score) while training only ~3% of the parameters, establishing it as the most efficient method. This research demonstrates that PEFT is not merely an efficiency optimization, but a necessary tool for robust generalization in data-scarce environments, providing practitioners with a clear guide to navigate the performance-efficiency trade-off. By unifying these evaluations under controlled conditions, this study advances beyond fragmented prior research and offers a systematic framework for selecting PEFT strategies.
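The parameter efficiency the abstract reports can be made concrete with a back-of-the-envelope sketch (illustrative only, not the paper's code; the 768-dimensional layer and rank 8 are assumed values chosen to resemble a DistilBERT attention projection):

```python
# Back-of-the-envelope sketch of LoRA's parameter efficiency (illustrative,
# not the paper's code). LoRA freezes a dense weight W (d_out x d_in) and
# learns a low-rank update B @ A with A: r x d_in and B: d_out x r, so only
# r * (d_in + d_out) parameters are trainable instead of d_in * d_out.

def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole dense layer is updated."""
    return d_in * d_out

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for the rank-r update B @ A."""
    return r * d_in + d_out * r

# DistilBERT-like hidden size 768 and an assumed rank r = 8:
full = full_finetune_params(768, 768)   # 589,824
lora = lora_params(768, 768, 8)         # 12,288
print(f"LoRA trains {lora / full:.2%} of this layer's parameters")
```

The same accounting explains why ReFT, which intervenes on hidden representations rather than weights, can shrink the trainable budget even further.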

Citations: 0
Adaptive deep Q-networks for accurate electric vehicle range estimation.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-27 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1697478
Urvashi Khekare, Rajay Vedaraj I S

It is critical that electric vehicles estimate the remaining driving range after charging, as this has direct implications for drivers' range anxiety and thus for large-scale EV adoption. Traditional machine-learning approaches to range prediction rely heavily on large amounts of vehicle-specific data and are therefore neither scalable nor adaptable. In this paper, a deep reinforcement learning framework is proposed, utilizing big data from 103 EV models from 31 manufacturers. The dataset combines several operational variables (state of charge, voltage, current, temperature, vehicle speed, and discharge characteristics) that reflect highly dynamic driving states. Outliers in this heterogeneous data were reduced through a hybrid fuzzy k-means clustering approach, enhancing the quality of the training data. In addition, a pathfinder meta-heuristic was applied to optimize the reward function of the deep Q-learning algorithm, accelerating convergence and improving accuracy. Experimental validation shows that the proposed framework halves the range error, to [-0.28, 0.40] on independent testing and [-0.23, 0.34] under 10-fold cross-validation. The proposed approach outperforms traditional machine learning and transformer-based approaches in Mean Absolute Error (by 61.86% and 4.86%, respectively) and in Root Mean Square Error (by 6.36% and 3.56%, respectively). This highlights the robustness of the framework under complex, dynamic EV data and its ability to enable scalable, intelligent range prediction, encouraging innovation in infrastructure and climate-conscious mobility.
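The temporal-difference update at the core of deep Q-learning can be illustrated on a toy deterministic chain (a minimal sketch under simplifying assumptions: the paper uses a neural Q-network with a meta-heuristically tuned reward, not this tabular form, and the chain MDP below is invented for illustration):

```python
# Toy illustration of the temporal-difference update behind deep Q-learning
# (the paper itself uses a neural approximator; this tabular chain MDP is an
# assumption-laden stand-in). States 0..3; the single action moves one step
# right; reward 1 on reaching the terminal state 3, otherwise 0.

N_STATES, TERMINAL = 4, 3
alpha, gamma = 0.5, 0.9          # learning rate and discount factor
Q = [0.0] * N_STATES             # one action, so Q is just a value per state

for _ in range(200):             # episodes
    s = 0
    while s != TERMINAL:
        s_next = s + 1
        r = 1.0 if s_next == TERMINAL else 0.0
        bootstrap = 0.0 if s_next == TERMINAL else gamma * Q[s_next]
        Q[s] += alpha * (r + bootstrap - Q[s])   # TD update
        s = s_next

# Q converges to the discounted return: [0.81, 0.9, 1.0, 0.0]
print([round(q, 2) for q in Q])
```

A deep Q-network replaces the table with a network over continuous state variables (state of charge, speed, temperature, and so on), but the update target has the same shape.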

Citations: 0
M-PSGP: a momentum-based proximal scaled gradient projection algorithm for nonsmooth optimization with application to image deblurring.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-24 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1704189
Kexin Ning, Qingguo Lü, Xiaofeng Liao

In this study, we focus on investigating a nonsmooth convex optimization problem involving the l1-norm under a non-negative constraint, with the goal of developing an inverse-problem solver for image deblurring. Research focused on solving this problem has garnered extensive attention and has had a significant impact on the field of image processing. However, existing optimization algorithms often suffer from overfitting and slow convergence, particularly when working with ill-conditioned data or noise. To address these challenges, we propose a momentum-based proximal scaled gradient projection (M-PSGP) algorithm. The M-PSGP algorithm, which is based on the proximal operator and scaled gradient projection (SGP) algorithm, integrates an improved Barzilai-Borwein-like step-size selection rule and a unified momentum acceleration framework to achieve a balance between performance optimization and convergence rate. Numerical experiments demonstrate the superiority of the M-PSGP algorithm over several seminal algorithms in image deblurring tasks, highlighting the significance of our improved step-size strategy and momentum-acceleration framework in enhancing convergence properties.
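The proximal machinery described here can be sketched concretely: under the non-negativity constraint, the proximal operator of the l1 term reduces to a one-sided soft threshold, max(v - lam*t, 0), and a heavy-ball momentum term can be added to the gradient step. The code below is a minimal illustrative solver with fixed, invented step sizes, not the authors' M-PSGP with its scaled projection and Barzilai-Borwein-like rule:

```python
# Minimal sketch of a momentum-accelerated proximal gradient step for
#   min_x 0.5*||Ax - b||^2 + lam*||x||_1   subject to x >= 0,
# the problem class the abstract targets. Step size, momentum weight, and
# iteration count are illustrative assumptions.

def prox_nonneg_l1(v, t):
    # prox of t*||.||_1 restricted to the non-negative orthant:
    # a one-sided soft threshold
    return [max(vi - t, 0.0) for vi in v]

def solve(A, b, lam=0.1, step=0.1, beta=0.5, iters=500):
    m, n = len(A), len(A[0])
    x, x_prev = [0.0] * n, [0.0] * n
    for _ in range(iters):
        # gradient of the smooth part: A^T (A x - b)
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        grad = [sum(A[i][j] * (Ax[i] - b[i]) for i in range(m)) for j in range(n)]
        # heavy-ball momentum extrapolation, then the proximal step
        y = [x[j] + beta * (x[j] - x_prev[j]) - step * grad[j] for j in range(n)]
        x_prev, x = x, prox_nonneg_l1(y, lam * step)
    return x

A = [[1.0, 0.0], [0.0, 1.0]]   # identity operator: closed form is max(b - lam, 0)
b = [1.0, -0.5]
x = solve(A, b)
print(x)   # first coordinate near 0.9; second clamped to 0 by the constraint
```

In the deblurring setting, A would be the blur operator acting on the vectorized image; the identity A above only serves to make the closed-form solution checkable.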

Citations: 0
Integrated analysis for drug repositioning in migraine using genetic evidence and claims database.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-21 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1677167
Shoichiro Inokuchi, Takumi Tajima

Introduction: Migraine is a prevalent neurological disorder with a substantial socioeconomic burden, underscoring the need for continued identification of therapeutic targets. Given the significant role of genetic factors in migraine pathogenesis, a genetic-based approach is considered effective for identifying potential therapeutic targets. This study aimed to identify candidate treatments for migraine by integrating genome-wide association study (GWAS) data, perturbagen profiles, and a large-scale claims database.

Methods: We used published GWAS data to impute disease-specific gene expression profiles using a transcriptome-wide association study approach. The imputed gene signatures were cross-referenced with perturbagen signatures from the LINCS Connectivity Map to identify candidate compounds capable of reversing the disease-associated gene expression. A real-world claims database was subsequently utilized to assess the clinical efficacy of the identified perturbagens on acute migraine, employing a cohort study design and mixed-effects log-linear models with the frequency of prescribed acute migraine medications as the outcome.

Results: Eighteen approved drugs were identified as candidate therapeutics based on the perturbagen profiles. Real-world analysis using the claims database demonstrated potential inhibitory effects of metformin (relative risk [RR]: 0.81; 95% confidence interval [CI]: 0.77-0.86), statins (RR: 0.94; 95% CI: 0.92-0.96), thiazolidines (RR: 0.84; 95% CI: 0.73-0.97), and angiotensin receptor neprilysin inhibitors (RR: 0.69; 95% CI: 0.61-0.77) on migraine attacks.

Conclusion: This multidisciplinary approach highlights a cost-effective framework for drug repositioning for migraine treatment by integrating genetic, pharmacological, and real-world clinical database.

Citations: 0
Intelligent leak monitoring of oil pipeline based on distributed temperature and vibration fiber signals.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-20 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1667284
Xiaobin Liang, Yonghong Deng, Yibin Wang, Hongtao Li, Weifeng Ma, Ke Wang, Junjie Ren, Ruijiao Ma, Shuai Zhang, Jiawei Liu, Wei Wu

Due to long-term usage, natural disasters, and human factors, pipelines may leak or rupture, with serious consequences, so real-time monitoring and detection of pipeline leaks is of great significance. Current mainstream leak-monitoring methods mostly rely on a single signal and therefore have significant limitations: temperature alone is susceptible to interference from the ambient temperature, leading to misjudgment, while a vibration signal alone is affected by pipeline operating noise. To address this, this research built a distributed optical-fiber system as an experimental platform for temperature and vibration monitoring, obtaining 3,530 sets of real-time, synchronized spatial-temporal temperature and vibration signals. A dual-parameter fusion residual neural network was constructed to extract characteristic signals from the original spatial-temporal temperature and vibration signals produced by this monitoring system, achieving a classification accuracy of 92.16% for pipeline leak status and a leak-localization accuracy of 1 m. This addresses the insufficient feature extraction and weak anti-interference ability of single-signal monitoring: by fusing the original temperature and vibration signals, more leakage features can be extracted. Compared with single-signal monitoring, this study therefore improves the accuracy of leak identification and localization, closes the gap of misjudgment caused by single-signal interference, and provides a basis for pipeline leak monitoring and real-time warning in the oil industry.

Citations: 0
Robust detection framework for adversarial threats in Autonomous Vehicle Platooning.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-19 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1617978
Stephanie Ness

Introduction: The study addresses adversarial threats in Autonomous Vehicle Platooning (AVP) using machine learning.

Methods: A novel method integrating active learning with Random Forest (RF), Gradient Boosting (GB), XGBoost (XGB), K-Nearest Neighbors (KNN), Logistic Regression (LR), and AdaBoost classifiers was developed.

Results: Random Forest with active learning yielded the highest accuracy of 83.91%.

Discussion: The proposed framework significantly reduces labeling efforts and improves threat detection, enhancing AVP system security.
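The labeling-effort reduction comes from the active-learning loop, which can be sketched as pool-based uncertainty sampling (illustrative only: a toy 1-D threshold model stands in for the RF/GB/XGB/KNN/LR/AdaBoost classifiers, and the pool, oracle, and budget below are all invented):

```python
# Illustrative pool-based active learning with uncertainty sampling.
# A toy 1-D threshold model stands in for the paper's ensemble classifiers;
# all names and numbers here are assumptions.

def fit_threshold(labeled):
    """Toy model: boundary halfway between the largest x labeled 0
    and the smallest x labeled 1."""
    lo = max((x for x, y in labeled if y == 0), default=0.0)
    hi = min((x for x, y in labeled if y == 1), default=1.0)
    return (lo + hi) / 2

def active_learn(pool, oracle, budget):
    # seed with the two extreme points, then query by uncertainty
    labeled = [(min(pool), oracle(min(pool))), (max(pool), oracle(max(pool)))]
    queried = {min(pool), max(pool)}
    for _ in range(budget):
        t = fit_threshold(labeled)
        candidates = [x for x in pool if x not in queried]
        if not candidates:
            break
        # uncertainty sampling: label the point nearest the decision boundary
        x = min(candidates, key=lambda c: abs(c - t))
        labeled.append((x, oracle(x)))
        queried.add(x)
    return fit_threshold(labeled)

pool = [i / 100 for i in range(101)]     # unlabeled pool on [0, 1]
oracle = lambda x: int(x >= 0.37)        # hidden ground-truth rule
t = active_learn(pool, oracle, budget=8)
print(round(t, 3))   # close to the true boundary 0.37 after only 10 labels
```

The point of the sketch is the query rule, not the model: labeling only the most uncertain samples recovers the decision boundary with far fewer labels than labeling the pool exhaustively.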

Citations: 0
Enhanced SQL injection detection using chi-square feature selection and machine learning classifiers.
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-19 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1686479
Emanuel Casmiry, Neema Mduma, Ramadhani Sinde

In the face of increasing cyberattacks, Structured Query Language (SQL) injection remains one of the most common and damaging types of web threats, accounting for over 20% of global cyberattack costs. However, due to its dynamic and variable nature, current detection methods often suffer from high false positive rates and reduced accuracy. This study proposes an enhanced SQL injection detection approach using Chi-square feature selection (FS) and machine learning models. A combined dataset was assembled by merging a custom dataset with the SQLiV3.csv file from the Kaggle repository. A Jensen-Shannon Divergence (JSD) analysis revealed moderate domain variation (overall JSD = 0.5775), with class-wise divergence of 0.1340 for SQLi and 0.5320 for benign queries. Term Frequency-Inverse Document Frequency (TF-IDF) was used to convert SQL queries into feature vectors, followed by Chi-square feature selection to retain the most statistically significant features. Five classifiers, namely multinomial Naïve Bayes, support vector machine, logistic regression, decision tree, and K-nearest neighbor, were tested before and after feature selection. The results reveal that Chi-square feature selection improves classification performance across all models by reducing noise and eliminating redundant features. Notably, the Decision Tree and K-Nearest Neighbors (KNN) models, which initially performed poorly, showed substantial improvements after feature selection. The Decision Tree improved from being the second-worst performer before feature selection to the best classifier afterward, achieving the highest accuracy of 99.73%, precision of 99.72%, recall of 99.70%, F1-score of 99.71%, a false positive rate (FPR) of 0.25%, and a misclassification rate of 0.27%. These findings highlight the crucial role of feature selection in high-dimensional data environments.
Future research will investigate how feature selection interacts with deep learning architectures, adaptive feature selection, incremental learning, and robustness against adversarial attacks, and will evaluate model transferability across production web environments to ensure real-time detection reliability, establishing feature selection as a vital step in developing reliable SQL injection detection systems.
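The chi-square scoring step can be illustrated directly (a simplified sketch: binary term-presence counts on four invented toy queries stand in for the paper's TF-IDF vectors and merged Kaggle dataset):

```python
# Simplified chi-square feature scoring for SQL-injection detection.
# Illustrative only: for each token, build a 2x2 contingency table of
# (token present?) x (query malicious?) and compute the chi-square
# statistic; high-scoring tokens are the most class-discriminative.

def chi2_score(queries, labels, token):
    # rows = token absent/present, cols = class 0/1
    obs = [[0, 0], [0, 0]]
    for q, y in zip(queries, labels):
        obs[int(token in q.split())][y] += 1
    n = len(queries)
    row = [sum(obs[i]) for i in range(2)]
    col = [obs[0][j] + obs[1][j] for j in range(2)]
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected:
                score += (obs[i][j] - expected) ** 2 / expected
    return score

queries = ["select name from users",
           "select * from users where 1 = 1 or 1 = 1",
           "drop table users ; --",
           "update users set age = 30"]
labels = [0, 1, 1, 0]   # 1 = injection-like

print(chi2_score(queries, labels, "or"))      # class-specific token: about 1.33
print(chi2_score(queries, labels, "select"))  # evenly split across classes: 0.0
```

Keeping only the top-scoring tokens is what removes the noisy, class-independent features that the abstract credits for the post-selection accuracy gains.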

Citations: 0
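The pipeline described in the abstract (token features scored by chi-square, keeping only the most class-discriminative ones) can be sketched in miniature. This is a pure-Python illustration on a hypothetical six-query toy corpus with whitespace tokenization, not the paper's dataset, tokenizer, or TF-IDF weighting:

```python
# Chi-square feature scoring sketch for SQL injection detection.
# Toy corpus, labels (1 = attack, 0 = benign), and tokenizer are
# illustrative stand-ins for the paper's TF-IDF + chi-square setup.

queries = [
    ("select name from users where id = 1", 0),
    ("select email from accounts where uid = 7", 0),
    ("update items set price = 3 where id = 2", 0),
    ("select * from users where id = 1 or 1=1", 1),
    ("admin' union select password from users --", 1),
    ("1; drop table users --", 1),
]

def chi2_score(token, docs):
    """Chi-square statistic of the 2x2 token-presence vs. class table."""
    a = b = c = d = 0  # a: present+attack, b: present+benign, c: absent+attack, d: absent+benign
    for text, label in docs:
        present = token in text.split()
        if present and label == 1:
            a += 1
        elif present and label == 0:
            b += 1
        elif not present and label == 1:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

vocab = sorted({tok for text, _ in queries for tok in text.split()})
scores = {tok: chi2_score(tok, queries) for tok in vocab}
top = sorted(scores, key=scores.get, reverse=True)[:5]
print(top)
```

On this toy corpus the comment marker `--` scores highest (it appears only in attacks) while `select` scores zero (it is equally common in both classes), which is exactly the noise-reduction effect the abstract attributes to chi-square selection.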
CrossDF: improving cross-domain deepfake detection with deep information decomposition. crosssdf:利用深度信息分解改进跨域深度伪造检测。
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-18 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1669488
Shanmin Yang, Hui Guo, Shu Hu, Bin Zhu, Ying Fu, Siwei Lyu, Xi Wu, Xin Wang

Deepfake technology represents a serious risk to safety and public confidence. While current detection approaches perform well in identifying manipulations within datasets that utilize identical deepfake methods for both training and validation, they experience notable declines in accuracy when applied to cross-dataset situations, where unfamiliar deepfake techniques are encountered during testing. To tackle this issue, we propose a Deep Information Decomposition (DID) framework to improve Cross-dataset Deepfake Detection (CrossDF). Distinct from most existing deepfake detection approaches, our framework emphasizes high-level semantic attributes instead of focusing on particular visual anomalies. More specifically, it intrinsically decomposes facial representations into deepfake-relevant and unrelated components, leveraging only the deepfake-relevant features for classification between genuine and fabricated images. Furthermore, we introduce an adversarial mutual information minimization strategy that enhances the separability between these two types of information through decorrelation learning. This significantly improves the model's robustness to irrelevant variations and strengthens its generalization capability to previously unseen manipulation techniques. Extensive experiments demonstrate the effectiveness and superiority of our proposed DID framework for cross-dataset deepfake detection. It achieves an AUC of 0.779 in cross-dataset evaluation from FF++ to CDF2 and improves the state-of-the-art AUC significantly from 0.669 to 0.802 on the diffusion-based Text-to-Image dataset.

Citations: 0
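The decorrelation idea in the abstract (making deepfake-relevant and irrelevant feature branches statistically independent) can be illustrated with a much simpler, non-adversarial stand-in: directly penalizing the cross-correlation matrix of two feature batches. This is not the paper's mutual-information estimator, only a minimal sketch of what such a separability loss measures:

```python
# Decorrelation penalty between a "deepfake-relevant" and an "unrelated"
# feature branch. A simplified stand-in for the paper's adversarial
# mutual-information minimization: penalize squared cross-correlations.

import numpy as np

rng = np.random.default_rng(0)

def decorrelation_loss(f_rel, f_irr):
    """Mean squared entry of the cross-correlation matrix of two batches."""
    f_rel = (f_rel - f_rel.mean(axis=0)) / (f_rel.std(axis=0) + 1e-8)
    f_irr = (f_irr - f_irr.mean(axis=0)) / (f_irr.std(axis=0) + 1e-8)
    cross = f_rel.T @ f_irr / f_rel.shape[0]  # (d_rel, d_irr) correlation estimates
    return float((cross ** 2).mean())

# Correlated branches incur a larger penalty than independent ones.
shared = rng.normal(size=(256, 8))
loss_correlated = decorrelation_loss(shared, shared + 0.1 * rng.normal(size=(256, 8)))
loss_independent = decorrelation_loss(rng.normal(size=(256, 8)), rng.normal(size=(256, 8)))
print(loss_correlated > loss_independent)
```

Minimizing such a term during training pushes the two branches apart, which is the mechanism the abstract credits for robustness to manipulation techniques unseen at training time.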
Implementation of a net benefit parameter in ROC curve decision thresholds for AI-powered mammography screening. 在人工智能乳房x线摄影筛查的ROC曲线决策阈值中实现净收益参数。
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-18 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1690955
Anastasia Petrovna Pamova, Yuriy Aleksandrovich Vasilev, Tatyana Mikhaylovna Bobrovskaya, Anton Vyacheslavovich Vladzimirskyy, Olga Vasilyevna Omelyanskaya, Kirill Mikhailovich Arzamasov

Background: The rapid integration of artificial intelligence (AI) into mammography necessitates robust quality control methods. The lack of standardized methods for establishing decision thresholds on the Receiver Operating Characteristic (ROC) curves makes it challenging to judge the AI performance. This study aims to develop a method for determining the decision threshold for AI in screening mammography to ensure the widest possible population of women with a breast pathology is diagnosed.

Methods: Three AI models were retrospectively evaluated using a dataset of digital mammograms. The dataset consisted of screening mammography examinations obtained from 663,606 patients over the age of 40. Our method estimates the decision threshold using a novel approach to net benefit (NB) analysis. Our approach to setting the cutoff threshold was compared with the threshold determined by Youden's index using McNemar's test.

Results: Replacing the Youden index with our method across three AI models resulted in a threefold reduction in false-positive rates, twofold reduction in false-negative rates, and twofold increase in true-positive rates. Thus, the sensitivity at the cutoff threshold determined by NB increased to 99% (maximum) compared to the sensitivity determined by Youden's index threshold (72% maximum). Correspondingly, the specificity when using our method decreased to 48% (minimum), compared to 75% (minimum) with the Youden's index method.

Conclusions: We propose using AI as the initial reader together with our novel method for determining the decision threshold in screening with double reading. This approach enhances the AI sensitivity and improves timely breast cancer diagnosis.

Citations: 0
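The trade-off the abstract reports (net-benefit thresholding yielding higher sensitivity than Youden's index) can be reproduced on synthetic scores. The paper's exact net-benefit formulation is not given in this listing; the sketch below uses the standard decision-curve formula NB(t) = TP/N − FP/N · t/(1−t) with a hypothetical screening threshold probability of 0.05, purely as an illustration:

```python
# Two ways to pick an ROC operating point: Youden's J = TPR - FPR vs.
# decision-curve net benefit NB(t) = TP/N - FP/N * t/(1-t).
# Synthetic scores and the probability threshold pt are illustrative.

import numpy as np

rng = np.random.default_rng(1)
y = np.concatenate([np.ones(200), np.zeros(1800)])  # 10% disease prevalence
s = np.concatenate([rng.normal(1.2, 1.0, 200), rng.normal(0.0, 1.0, 1800)])

def rates(thr):
    pred = s >= thr
    tp = int(np.sum(pred & (y == 1)))
    fp = int(np.sum(pred & (y == 0)))
    return tp / 200, fp / 1800, tp, fp

thresholds = np.linspace(s.min(), s.max(), 400)
j_thr = max(thresholds, key=lambda t: rates(t)[0] - rates(t)[1])

pt = 0.05  # screening tolerates false positives, so pt is set low

def net_benefit(thr):
    _, _, tp, fp = rates(thr)
    return tp / len(y) - fp / len(y) * pt / (1 - pt)

nb_thr = max(thresholds, key=net_benefit)
sens_j, sens_nb = rates(j_thr)[0], rates(nb_thr)[0]
print(sens_nb, sens_j)
```

Because net benefit at a low probability threshold weights missed cancers far more heavily than false recalls, it selects a lower score cutoff and hence a more sensitive operating point than Youden's index, mirroring the direction of the paper's result.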
Enhancing Bangla handwritten character recognition using Vision Transformers, VGG-16, and ResNet-50: a performance analysis. 使用Vision transformer、VGG-16和ResNet-50增强孟加拉语手写字符识别:性能分析。
IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2025-11-14 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1682984
A H M Shahariar Parvez, Md Samiul Islam, Fahmid Al Farid, Tashida Yeasmin, Md Monirul Islam, Md Shafiul Azam, Jia Uddin, Hezerul Abdul Karim

Bangla Handwritten Character Recognition (BHCR) remains challenging due to complex alphabets and handwriting variations. In this study, we present a comparative evaluation of three deep learning architectures, Vision Transformer (ViT), VGG-16, and ResNet-50, on the CMATERdb 3.1.2 dataset comprising 24,000 images of 50 basic Bangla characters. Our work highlights the effectiveness of ViT in capturing global context and long-range dependencies, leading to improved generalization. Experimental results show that ViT achieves a state-of-the-art accuracy of 98.26%, outperforming VGG-16 (94.54%) and ResNet-50 (93.12%). We also analyze model behavior, discuss overfitting in CNNs, and provide insights into character-level misclassifications. This study demonstrates the potential of transformer-based architectures for robust BHCR and offers a benchmark for future research.

Citations: 0
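The global-context advantage the abstract attributes to ViT starts with how the model tokenizes an image: it is cut into fixed-size patches, each flattened and linearly projected into an embedding that self-attention then relates to every other patch. A minimal NumPy sketch, with illustrative shapes (a 32×32 grayscale character, 8×8 patches, 64-dim embeddings) rather than the paper's configuration:

```python
# Vision Transformer patch embedding: split an image into patches,
# flatten each, and project linearly. Shapes are illustrative only.

import numpy as np

rng = np.random.default_rng(42)
image = rng.random((32, 32))   # one grayscale handwritten-character image
patch = 8
d_model = 64

# (32, 32) -> (4, 8, 4, 8) -> (16, 64): sixteen flattened 8x8 patches
patches = image.reshape(32 // patch, patch, 32 // patch, patch)
patches = patches.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

W = rng.normal(0, 0.02, size=(patch * patch, d_model))  # learned in practice
tokens = patches @ W           # (16, 64) patch tokens fed to self-attention
print(tokens.shape)
```

Unlike a CNN's local receptive fields, every one of these 16 tokens attends to all the others from the first layer, which is the mechanism behind the long-range-dependency claim in the abstract.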