Pub Date: 2025-09-10 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1638307
Thijs Veugen, Gabriele Spini, Frank Muller
Secure aggregation of distributed inputs is a well-studied problem. In this study, anonymity of inputs is achieved by assuring a minimal quota before publishing the outcome. We design and implement an efficient cryptographic protocol that mitigates the most important security risks and show its application in the cyber threat intelligence (CTI) domain. Our approach allows for generic aggregation and quota functions. With 20 inputs from different parties, we can perform three secure and anonymous aggregations per second, and in a CTI community of 100 partners, 10,000 aggregations could be performed during one night.
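The quota idea can be illustrated with a toy additive-secret-sharing sketch: each party splits its input into random shares that sum to the value, and the aggregate is revealed only once enough parties have contributed. This is a minimal sketch under assumed parameters (the modulus, share scheme, and function names are illustrative), not the paper's protocol.

```python
import secrets

P = 2**61 - 1  # public prime modulus (illustrative choice)

def share(value, n):
    """Split `value` into n additive shares modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def aggregate(all_shares, quota):
    """Reveal the sum only if at least `quota` parties contributed."""
    if len(all_shares) < quota:
        return None  # too few inputs: publishing could deanonymize them
    return sum(s for shares in all_shares for s in shares) % P

inputs = [3, 5, 7]  # one private input per party
shared = [share(x, n=len(inputs)) for x in inputs]
assert aggregate(shared, quota=3) == 15       # quota met: sum revealed
assert aggregate(shared[:2], quota=3) is None  # quota not met: nothing published
```

No single share reveals anything about its input; only the quota-gated sum is ever reconstructed, which is the anonymity property the abstract describes.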
Secure aggregation of sufficiently many private inputs. Frontiers in Big Data, 8:1638307. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12457162/pdf/
Pub Date: 2025-08-25 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1564521
Elena Senger, Yuri Campbell, Rob van der Goot, Barbara Plank
Predicting career trajectories is a complex yet impactful task, offering significant benefits for personalized career counseling, recruitment optimization, and workforce planning. However, effective career path prediction (CPP) modeling faces challenges including highly variable career trajectories, free-text resume data, and limited publicly available benchmark datasets. In this study, we present a comprehensive comparative evaluation of CPP models, namely linear projection, multilayer perceptron (MLP), LSTM, and large language models (LLMs), across multiple input settings and two recently introduced public datasets. Our contributions are threefold: (1) we propose novel model variants, including an MLP extension and a standardized LLM approach; (2) we systematically evaluate model performance across input types (titles only vs. title+description, standardized vs. free-text); and (3) we investigate the role of synthetic data and fine-tuning strategies in addressing data scarcity and improving model generalization. Additionally, we provide a detailed qualitative analysis of prediction behaviors across industries, career lengths, and transitions. Our findings establish new baselines, reveal the trade-offs of different modeling strategies, and offer practical insights for deploying CPP systems in real-world settings.
Toward more realistic career path prediction: evaluation and methods. Frontiers in Big Data, 8:1564521. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12415007/pdf/
Pub Date: 2025-08-13 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1657320
R Parvathi, V Pattabiraman, Nancy Saxena, Aakarsh Mishra, Utkarsh Mishra, Ansh Pandey
Introduction: OpenStreetMap (OSM) road surface data is critical for navigation, infrastructure monitoring, and urban planning but is often incomplete or inconsistent. This study addresses the need for automated validation and classification of road surfaces by leveraging high-resolution aerial imagery and deep learning techniques.
Methods: We propose a MaskCNN-based deep learning model enhanced with attention mechanisms and a hierarchical loss function to classify road surfaces into four types: asphalt, concrete, gravel, and dirt. The model uses NAIP (National Agriculture Imagery Program) aerial imagery aligned with OSM labels. Preprocessing includes georeferencing, data augmentation, label cleaning, and class balancing. The architecture comprises a ResNet-50 encoder with squeeze-and-excitation blocks and a U-Net-style decoder with spatial attention. Evaluation metrics include accuracy, mIoU, precision, recall, and F1-score.
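The squeeze-and-excitation mechanism named in the Methods can be sketched channel-wise in NumPy: globally pool each channel, pass the pooled vector through a small bottleneck, and rescale channels by the resulting sigmoid gates. Shapes and the reduction ratio below are illustrative assumptions; in the paper the blocks sit inside a trained ResNet-50 encoder.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-excitation: reweight channels of a (C, H, W) feature map."""
    z = x.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    h = np.maximum(w1 @ z, 0.0)            # excitation: bottleneck + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))    # sigmoid channel gates -> (C,)
    return x * s[:, None, None]            # scale each channel by its gate

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))         # C=8 channels (toy size)
w1 = rng.standard_normal((2, 8))           # reduction ratio 4 (assumed)
w2 = rng.standard_normal((8, 2))
y = se_block(x, w1, w2)
assert y.shape == x.shape                  # output keeps the feature-map shape
```

The gates let the network emphasize channels that are informative for a given surface type, which is the attention effect the Methods paragraph refers to.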
Results: The proposed model achieved an overall accuracy of 92.3% and a mean Intersection over Union (mIoU) of 83.7%, outperforming baseline models such as SVM (81.2% accuracy), Random Forest (83.7%), and standard U-Net (89.6%). Class-wise performance showed high precision and recall even for challenging surface types like gravel and dirt. Comparative evaluations against state-of-the-art models (COANet, SA-UNet, MMFFNet) also confirmed superior performance.
Discussion: The results demonstrate that combining NAIP imagery with attention-guided CNN architectures and hierarchical loss functions significantly improves road surface classification. The model is robust across varied terrains and visual conditions and shows potential for real-world applications such as OSM data enhancement, infrastructure analysis, and autonomous navigation. Limitations include label noise in OSM and class imbalance, which can be addressed through future work involving semi-supervised learning and multimodal data integration.
Automated road surface classification in OpenStreetMap using MaskCNN and aerial imagery. Frontiers in Big Data, 8:1657320. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12382388/pdf/
Pub Date: 2025-08-12 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1666305
Roberto Interdonato, Hocine Cherifi
Editorial: Interdisciplinary approaches to complex systems: highlights from FRCCS 2023/24. Frontiers in Big Data, 8:1666305. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12382158/pdf/
Pub Date: 2025-08-08 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1605018
Zeena Kailani, Lauren Kim, Joshua Bierbrier, Michael Balas, David J Mathew
Introduction: Glaucoma is a leading cause of irreversible blindness, and its rising global prevalence has led to a significant increase in glaucoma surgeries. However, predicting postoperative outcomes remains challenging due to the complex interplay of patient factors, surgical techniques, and postoperative care. Artificial intelligence (AI) has emerged as a promising tool for enhancing predictive accuracy in clinical decision-making.
Methods: This systematic review was conducted to evaluate the current evidence on the use of AI to predict surgical outcomes in glaucoma patients. A comprehensive search of Medline, Embase, Web of Science, and Scopus was performed. Studies were included if they applied AI models to glaucoma surgery outcome prediction.
Results: Six studies met inclusion criteria, collectively analyzing 4,630 surgeries. A variety of algorithms were applied, including random forests, support vector machines, and neural networks. Overall, AI models consistently outperformed traditional statistical approaches, with the best-performing model achieving an accuracy of 87.5%. Key predictors of outcomes included demographic factors (e.g., age), systemic health indicators (e.g., smoking status and body mass index), and ophthalmic parameters (e.g., baseline intraocular pressure, central corneal thickness, mitomycin C use).
Discussion: While AI models demonstrated superior performance to traditional statistical approaches, the lack of external validation and standardized surgical success definitions limit their clinical applicability. This review highlights both the promise and the current limitations of artificial intelligence in glaucoma surgery outcome prediction, emphasizing the need for prospective, multicenter studies, publicly available datasets, and standardized evaluation metrics to enhance the generalizability and clinical utility of future models.
Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/view/CRD42024621758, identifier: CRD42024621758.
Artificial intelligence for surgical outcome prediction in glaucoma: a systematic review. Frontiers in Big Data, 8:1605018. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12370750/pdf/
Pub Date: 2025-08-07 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1557779
R Sabitha, D Sundar
Introduction: Recommender systems are essential in e-commerce for assisting users in navigating large product catalogs, particularly in visually driven domains like fashion. Traditional keyword-based systems often struggle to capture subjective style preferences.
Methods: This study proposes a novel fashion recommendation framework using an Adaptive VPKNN-net algorithm. The model integrates deep visual feature extraction using a pre-trained VGG16 Convolutional Neural Network (CNN), dimensionality reduction through Principal Component Analysis (PCA), and a modified K-Nearest Neighbors (KNN) algorithm that combines Euclidean and cosine similarity metrics to enhance visual similarity assessment.
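The blended similarity metric in the Methods can be sketched as a k-nearest-neighbor lookup over PCA-reduced feature vectors that mixes normalized Euclidean distance with cosine distance. The blending weight `alpha` and the function names are assumptions for illustration; the paper's exact combination rule may differ.

```python
import numpy as np

def combined_knn(query, catalog, k=3, alpha=0.5):
    """Rank catalog items by a blend of Euclidean and cosine distance.

    `query`/`catalog` are feature vectors (e.g., PCA-reduced CNN features);
    `alpha` weights the two metrics -- an assumed blending scheme.
    """
    eu = np.linalg.norm(catalog - query, axis=1)
    cos = 1.0 - (catalog @ query) / (
        np.linalg.norm(catalog, axis=1) * np.linalg.norm(query) + 1e-12)
    score = alpha * eu / (eu.max() + 1e-12) + (1 - alpha) * cos
    return np.argsort(score)[:k]           # indices of the k best matches

rng = np.random.default_rng(1)
catalog = rng.standard_normal((100, 16))   # 100 products, 16-dim features
query = catalog[42] + 0.01 * rng.standard_normal(16)
assert combined_knn(query, catalog, k=3)[0] == 42  # near-duplicate ranks first
```

Mixing the two metrics lets magnitude-sensitive (Euclidean) and orientation-sensitive (cosine) notions of visual similarity both contribute to the ranking.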
Results: Experiments were conducted using the "Fashion Product Images (Small)" dataset from Kaggle. The proposed system achieved high accuracy (98.69%) and demonstrated lower RMSE (0.8213) and MAE (0.6045) compared to baseline models such as Random Forest, SVM, and standard KNN.
Discussion: The proposed Adaptive VPKNN-net framework significantly improves the precision, interpretability, and efficiency of visual fashion recommendations. It eliminates the limitations of fuzzy similarity models and offers a scalable solution for visually oriented e-commerce platforms, particularly in cold-start scenarios and low-data conditions.
A fashion product recommendation based on adaptive VPKNN-NET algorithm without fuzzy similar image. Frontiers in Big Data, 8:1557779. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367692/pdf/
Pub Date: 2025-08-04 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1634133
Salma A Mahmood, Asaad A Khalaf, Saad S Hamadi
Iron deficiency anemia (IDA) and beta-thalassemia trait (BTT) are prevalent causes of microcytic anemia, often presenting overlapping hematological features that pose diagnostic challenges and necessitate prompt and precise management. Traditional discrimination indices, such as the Mentzer Index, Ihsan's formula, and the England and Fraser criteria, have been extensively applied in both research and clinical settings; however, their diagnostic performance varies considerably across different populations and datasets. This study proposes a novel and interpretable diagnostic model, the Basrah Score, developed using Elastic Net Logistic Regression (ENLR). This machine learning-based approach yields a flexible discrimination function that adapts to variations in clinical and environmental factors. The model was trained and validated on a local dataset of 2,120 individuals (1,080 with IDA and 1,040 with BTT) and was benchmarked against eight conventional indices. The Basrah Score demonstrated superior diagnostic performance, with an accuracy of 96.7%, a sensitivity of 95.0%, and a specificity of 98.6%. These results underscore the importance of incorporating advanced pre-processing techniques, class balancing, hyperparameter optimization, and rigorous cross-validation to ensure the robustness of diagnostic models. Overall, this research highlights the potential of integrating interpretable machine learning models with established clinical parameters to improve diagnostic accuracy in hematological disorders, particularly in resource-constrained settings.
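One of the traditional baselines named above, the Mentzer Index, is simple enough to state directly: MCV divided by RBC count, with values below 13 suggesting BTT and above 13 suggesting IDA. The helper below is an illustrative sketch of that fixed-threshold rule, which the ENLR-based Basrah Score replaces with learned weights; it is not the published scoring function.

```python
def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer Index = MCV (fL) / RBC count (millions/uL)."""
    return mcv_fl / rbc_millions_per_ul

def mentzer_call(mcv_fl, rbc):
    """Classic rule of thumb: index < 13 suggests BTT, > 13 suggests IDA."""
    return "BTT" if mentzer_index(mcv_fl, rbc) < 13 else "IDA"

assert mentzer_call(62.0, 5.8) == "BTT"  # low MCV with high RBC: 62/5.8 ~ 10.7
assert mentzer_call(70.0, 4.0) == "IDA"  # 70/4 = 17.5 > 13
```

A single fixed cutoff like this is exactly what varies in performance across populations, which motivates the learned, multi-parameter score the abstract describes.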
Basrah Score: a novel machine learning-based score for differentiating iron deficiency anemia and beta thalassemia trait using RBC indices. Frontiers in Big Data, 8:1634133. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12358405/pdf/
Pub Date: 2025-08-01 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1590551
Hanxin Lu, Xinyan Cheng, Jun Xiong
Background: Adverse effects of medical treatment (AEMT) pose critical global health challenges, yet comprehensive analyses of their long-term burden across socio-demographic contexts remain limited. This study evaluates 30-year trends (1990-2021) in AEMT-related mortality, disability-adjusted life years (DALYs), years lived with disability (YLDs), and years of life lost (YLLs) across 204 countries using Global Burden of Disease (GBD) 2021 data.
Methods: Age-standardized rates (ASRs) were stratified by sociodemographic index (SDI) quintiles. Frontier efficiency analysis quantified health loss boundaries relative to SDI, while concentration (C) and slope indices of inequality (SII) assessed health inequities. Predictive models projected trends to 2035.
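The slope index of inequality (SII) mentioned in the Methods is, in its common form, the slope from regressing the health indicator on units' relative rank in the socioeconomic ordering. A minimal unweighted sketch follows; the GBD analyses use population-weighted versions, so take this as the shape of the calculation only.

```python
import numpy as np

def slope_index_of_inequality(sdi, rate):
    """Regress the rate on relative SDI rank; the fitted slope is the SII.

    Unweighted sketch: a negative SII means the burden is concentrated
    in low-SDI units. Real analyses weight units by population.
    """
    order = np.argsort(sdi)
    ranks = (np.arange(len(sdi)) + 0.5) / len(sdi)  # rank-scale midpoints in (0, 1)
    slope, _ = np.polyfit(ranks, np.asarray(rate, float)[order], 1)
    return slope

sdi = [0.3, 0.5, 0.7, 0.9]       # toy country SDI values
rate = [4.0, 3.0, 2.0, 1.0]      # burden falls as SDI rises
assert slope_index_of_inequality(sdi, rate) < 0
```

The companion concentration index summarizes the same ordering as a single normalized measure; the negative C reported in the Results (-0.34 for premature mortality) likewise indicates burden concentrated at the low-SDI end.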
Results: The global age-standardized death rate (ASDR) declined by 36.3%, with low-SDI countries achieving the steepest reductions (5.31 to 3.71/100,000) but remaining 3.9-fold higher than in high-SDI nations. DALYs decreased by 39.7% (106.49 to 64.19/100,000), driven by infectious disease control in low-SDI regions. High-SDI countries experienced post-2010 mortality rebounds (0.86 to 0.95/100,000), linked to aging and complex interventions. YLLs declined by 40.3% (104.87 to 62.66/100,000), while YLDs peaked transiently (2010: 1.95/100,000). Frontier analysis revealed that low-SDI countries lagged furthest from optimal health outcomes, and inequality indices highlighted entrenched disparities (C: -0.34 for premature mortality). Projections suggest continued declines in ASDR, DALYs, and YLLs by 2035, contingent on addressing antimicrobial resistance and surgical overuse.
Conclusions: SDI-driven inequities necessitate tailored interventions: low-SDI regions require strengthened infection control and primary care, while high-SDI systems must mitigate overmedicalization risks. Hybrid strategies integrating digital health and cross-sector collaboration are critical for equitable burden reduction.
The global burden of adverse effects of medical treatment: a 30-year socio-demographic and geographic analysis using GBD 2021 data. Frontiers in Big Data, 8:1590551. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354518/pdf/
Pub Date: 2025-07-29 | eCollection Date: 2025-01-01 | DOI: 10.3389/fdata.2025.1609124
Fatema-E Jannat, Sina Gholami, Minhaj Nur Alam, Hamed Tabkhi
Introduction: In the medical AI field, there is a significant gap between advances in AI technology and the challenge of applying locally trained models to diverse patient populations. This is mainly due to the limited availability of labeled medical image data, driven by privacy concerns. To address this, we have developed a self-supervised machine learning framework for detecting eye diseases from optical coherence tomography (OCT) images, aiming to achieve generalized learning while minimizing the need for large labeled datasets.
Methods: Our framework, OCT-SelfNet, effectively addresses the challenge of data scarcity by integrating diverse datasets from multiple sources, ensuring a comprehensive representation of eye diseases. By employing a robust two-phase training strategy (self-supervised pre-training with unlabeled data followed by a supervised training stage), we leveraged a masked autoencoder built on the SwinV2 backbone.
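The masked-autoencoder pretext task can be sketched independently of the SwinV2 backbone: split the image into patches, hide a random subset, and score reconstruction error only on the hidden patches. Patch size, mask ratio, and the stand-in "reconstructor" callables below are toy assumptions, not the paper's configuration.

```python
import numpy as np

def masked_patch_loss(image, reconstruct, patch=4, mask_ratio=0.75, seed=0):
    """MSE over randomly masked patches only, as in masked-autoencoder pre-training."""
    h, w = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    patches = patches.reshape(-1, patch * patch)           # (num_patches, patch*patch)
    rng = np.random.default_rng(seed)
    masked = rng.choice(len(patches), size=int(mask_ratio * len(patches)),
                        replace=False)
    visible = patches.copy()
    visible[masked] = 0.0                 # hide the masked patches from the model
    recon = reconstruct(visible)          # model sees only the visible patches
    return float(np.mean((recon[masked] - patches[masked]) ** 2))

img = np.ones((16, 16))
assert masked_patch_loss(img, lambda v: np.ones_like(v)) == 0.0  # perfect recovery
assert masked_patch_loss(img, lambda v: v) > 0.0  # identity cannot recover masked patches
```

Because the loss is computed only where the input was hidden, the encoder must learn anatomy-level structure from unlabeled OCT scans, which is what makes the subsequent supervised stage label-efficient.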
Results: Extensive experiments were conducted across three datasets with varying encoder backbones, assessing scenarios including the absence of self-supervised pre-training, the absence of data fusion, low data availability, and unseen data to evaluate the efficacy of our methodology. OCT-SelfNet outperformed the baseline models (ResNet-50, ViT) in most cases. Additionally, when tested for cross-dataset generalization, OCT-SelfNet surpassed the baseline models, further demonstrating its strong generalization ability. An ablation study revealed significant improvements attributable to self-supervised pre-training and data fusion methodologies.
Discussion: Our findings suggest that the OCT-SelfNet framework is highly promising for real-world clinical deployment in detecting eye diseases from OCT images, demonstrating the effectiveness of our two-phase training approach and of a masked autoencoder based on the SwinV2 backbone. Our work bridges the gap between basic research and clinical application, significantly enhancing the framework's domain adaptation and generalization capabilities in detecting eye diseases.
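The abstract above describes masked-autoencoder pre-training: random image patches are hidden and the network is trained to reconstruct them, with the loss computed only on the masked patches. The following is a minimal NumPy sketch of that objective only, not the authors' implementation; the function name `masked_patch_loss` and its parameters are illustrative, and the SwinV2 encoder-decoder that would produce the reconstruction is omitted.

```python
import numpy as np

def masked_patch_loss(img, recon, patch=4, mask_ratio=0.75, seed=0):
    """MAE-style objective: mean squared error computed only over
    randomly selected (masked) patches of the input image."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    ph, pw = h // patch, w // patch            # patch-grid dimensions
    n_patches = ph * pw
    n_masked = int(n_patches * mask_ratio)     # e.g. hide 75% of patches
    masked = rng.choice(n_patches, size=n_masked, replace=False)
    loss = 0.0
    for idx in masked:
        i, j = divmod(idx, pw)
        sl = np.s_[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
        loss += np.mean((img[sl] - recon[sl]) ** 2)
    return loss / n_masked
```

In the actual framework `recon` would come from the SwinV2-based autoencoder; here any array of the same shape can be scored, which is enough to see that unmasked patches never contribute to the loss.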
{"title":"OCT-SelfNet: a self-supervised framework with multi-source datasets for generalized retinal disease detection.","authors":"Fatema-E Jannat, Sina Gholami, Minhaj Nur Alam, Hamed Tabkhi","doi":"10.3389/fdata.2025.1609124","DOIUrl":"10.3389/fdata.2025.1609124","url":null,"abstract":"<p><strong>Introduction: </strong>In the medical AI field, there is a significant gap between advances in AI technology and the challenge of applying locally trained models to diverse patient populations. This is mainly due to the limited availability of labeled medical image data, driven by privacy concerns. To address this, we have developed a self-supervised machine learning framework for detecting eye diseases from optical coherence tomography (OCT) images, aiming to achieve generalized learning while minimizing the need for large labeled datasets.</p><p><strong>Methods: </strong>Our framework, OCT-SelfNet, effectively addresses the challenge of data scarcity by integrating diverse datasets from multiple sources, ensuring a comprehensive representation of eye diseases. By employing a robust two-phase training strategy self-supervised pre-training with unlabeled data followed by a supervised training stage, we utilized the power of a masked autoencoder built on the SwinV2 backbone.</p><p><strong>Results: </strong>Extensive experiments were conducted across three datasets with varying encoder backbones, assessing scenarios including the absence of self-supervised pre-training, the absence of data fusion, low data availability, and unseen data to evaluate the efficacy of our methodology. OCT-SelfNet outperformed the baseline model (ResNet-50, ViT) in most cases. Additionally, when tested for cross-dataset generalization, OCT-SelfNet surpassed the performance of the baseline model, further demonstrating its strong generalization ability. An ablation study revealed significant improvements attributable to self-supervised pre-training and data fusion methodologies.</p><p><strong>Discussion: </strong>Our findings suggest that the OCT-SelfNet framework is highly promising for real-world clinical deployment in detecting eye diseases from OCT images. This demonstrates the effectiveness of our two-phase training approach and the use of a masked autoencoder based on the SwinV2 backbone. Our work bridges the gap between basic research and clinical application, which significantly enhances the framework's domain adaptation and generalization capabilities in detecting eye diseases.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"8 ","pages":"1609124"},"PeriodicalIF":2.4,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12339447/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144838516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Collaborative filtering generates recommendations by exploiting user-item similarities in rating data, which often contains numerous unrated items. To predict scores for unrated items, matrix factorization techniques such as nonnegative matrix factorization (NMF) are often employed. Nonnegative/binary matrix factorization (NBMF), an extension of NMF, approximates a nonnegative matrix as the product of a nonnegative matrix and a binary matrix. While previous studies have applied NBMF primarily to dense data such as images, this paper proposes a modified NBMF algorithm tailored for collaborative filtering with sparse data. In the modified method, unrated entries in the rating matrix are masked, enhancing prediction accuracy. Furthermore, utilizing a low-latency Ising machine in NBMF is advantageous in terms of computation time, making the proposed method practical.
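The key idea above — masking unrated entries so the factorization fits only observed ratings — can be sketched with plain masked NMF in NumPy. This is an illustrative sketch, not the paper's method: `masked_nmf` and its parameters are hypothetical names, the standard weighted multiplicative updates stand in for the paper's optimization, and NBMF's binary constraint on one factor (the part an Ising machine would solve) is omitted for brevity.

```python
import numpy as np

def masked_nmf(R, mask, k=2, iters=1000, seed=0):
    """Approximate R ≈ W @ H with nonnegative W, H, where the
    multiplicative updates only 'see' rated entries (mask == 1)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    W = rng.random((m, k)) + 0.1   # nonnegative initialization
    H = rng.random((k, n)) + 0.1
    eps = 1e-9                     # guard against division by zero
    for _ in range(iters):
        # weighted (masked) multiplicative update rules
        W *= ((mask * R) @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ (mask * R)) / (W.T @ (mask * (W @ H)) + eps)
    return W, H

# toy rating matrix: 0 marks an unrated user-item pair
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 3.0],
              [0.0, 2.0, 4.0]])
mask = (R > 0).astype(float)
W, H = masked_nmf(R, mask)
predictions = W @ H            # also yields scores for the unrated entries
```

Because the updates weight every term by the mask, unrated zeros never pull the reconstruction toward zero, which is exactly the accuracy benefit the abstract attributes to masking.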
{"title":"Collaborative filtering based on nonnegative/binary matrix factorization.","authors":"Yukino Terui, Yuka Inoue, Yohei Hamakawa, Kosuke Tatsumura, Kazue Kudo","doi":"10.3389/fdata.2025.1599704","DOIUrl":"10.3389/fdata.2025.1599704","url":null,"abstract":"<p><p>Collaborative filtering generates recommendations by exploiting user-item similarities based on rating data, which often contains numerous unrated items. To predict scores for unrated items, matrix factorization techniques such as nonnegative matrix factorization (NMF) are often employed. Nonnegative/binary matrix factorization (NBMF), which is an extension of NMF, approximates a nonnegative matrix as the product of nonnegative and binary matrices. While previous studies have applied NBMF primarily to dense data such as images, this paper proposes a modified NBMF algorithm tailored for collaborative filtering with sparse data. In the modified method, unrated entries in the rating matrix are masked, enhancing prediction accuracy. Furthermore, utilizing a low-latency Ising machine in NBMF is advantageous in terms of the computation time, making the proposed method beneficial.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"8 ","pages":"1599704"},"PeriodicalIF":2.4,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12339527/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144838515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}