Pub Date: 2026-02-06 | DOI: 10.1186/s42492-026-00214-4
Berenice Montalvo-Lezama, Gibran Fuentes-Pineda
The limited availability of annotated data presents a major challenge in applying deep learning methods to medical image analysis. Few-shot learning methods aim to recognize new classes from only a few labeled examples. These methods are typically investigated within a standard few-shot learning paradigm, in which all classes in a task are new. However, medical applications, such as pathology classification from chest X-rays, often require learning new classes while simultaneously leveraging the knowledge of previously known ones, a scenario more closely aligned with generalized few-shot classification. Despite its practical relevance, few-shot learning has rarely been investigated in this context. This study presents MetaChest, a large-scale dataset of 479,215 chest X-rays collected from four public databases. It includes a meta-set partition specifically designed for standard few-shot classification, as well as an algorithm for generating multi-label episodes. Extensive experiments were conducted to evaluate both the standard transfer learning (TL) approach and an extension of ProtoNet across a wide range of few-shot multi-label classification tasks. The results indicate that increasing the number of classes per episode and the number of training examples per class improves classification performance. Notably, the TL approach consistently outperformed the ProtoNet extension, even though it was not specifically tailored for few-shot learning. Furthermore, higher-resolution images improved accuracy at the cost of additional computation, whereas efficient model architectures achieved performance comparable to that of larger models with significantly reduced resource requirements.
{"title":"MetaChest: generalized few-shot learning of pathologies from chest X-rays.","authors":"Berenice Montalvo-Lezama, Gibran Fuentes-Pineda","doi":"10.1186/s42492-026-00214-4","DOIUrl":"10.1186/s42492-026-00214-4","url":null,"abstract":"<p><p>The limited availability of annotated data presents a major challenge in applying deep learning methods to medical image analysis. Few-shot learning methods aim to recognize new classes from only a few labeled examples. These methods are typically investigated within a standard few-shot learning paradigm, in which all classes in a task are new. However, medical applications, such as pathology classification from chest X-rays, often require learning new classes while simultaneously leveraging the knowledge of previously known ones, a scenario more closely aligned with generalized few-shot classification. Despite its practical relevance, few-shot learning has rarely been investigated in this context. This study presents MetaChest, a large-scale dataset of 479,215 chest X-rays collected from four public databases. It includes a meta-set partition specifically designed for standard few-shot classification, as well as an algorithm for generating multi-label episodes. Extensive experiments were conducted to evaluate both the standard transfer learning (TL) approach and an extension of ProtoNet across a wide range of few-shot multi-label classification tasks. The results indicate that increasing the number of classes per episode and the number of training examples per class improves the classification performance. Notably, the TL approach consistently outperformed the ProtoNet extension, even though it was not specifically tailored for few-shot learning. Furthermore, higher-resolution images improved the accuracy at the cost of additional computation, whereas efficient model architectures achieved performances comparable to larger models with significantly reduced resource requirements.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"9 1","pages":"4"},"PeriodicalIF":6.0,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12876522/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146126940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01 | DOI: 10.1186/s42492-025-00213-x
Lei Wang, Weiming Zeng, Kai Long, Hongyu Chen, Rongfeng Lan, Li Liu, Wai Ting Siok, Nizhuan Wang
Photoacoustic imaging (PAI), a modality that combines the high contrast of optical imaging with the deep penetration of ultrasound, is rapidly transitioning from preclinical research to clinical practice. However, its widespread clinical adoption faces challenges such as the inherent trade-off between penetration depth and spatial resolution, along with the demand for faster imaging speeds. This review comprehensively examines the fundamental principles of PAI, focusing on three primary implementations: photoacoustic computed tomography, photoacoustic microscopy, and photoacoustic endoscopy. It critically analyzes their respective advantages and limitations to provide insights into practical applications. The discussion then extends to recent advancements in image reconstruction and artifact suppression, where both conventional and deep learning (DL)-based approaches have been highlighted for their role in enhancing image quality and streamlining workflows. Furthermore, this work explores progress in quantitative PAI, particularly its ability to precisely measure hemoglobin concentration, oxygen saturation, and other physiological biomarkers. Finally, this review outlines emerging trends and future directions, underscoring the transformative potential of DL in shaping the clinical evolution of PAI.
{"title":"Advances in photoacoustic imaging reconstruction and quantitative analysis for biomedical applications.","authors":"Lei Wang, Weiming Zeng, Kai Long, Hongyu Chen, Rongfeng Lan, Li Liu, Wai Ting Siok, Nizhuan Wang","doi":"10.1186/s42492-025-00213-x","DOIUrl":"10.1186/s42492-025-00213-x","url":null,"abstract":"<p><p>Photoacoustic imaging (PAI), a modality that combines the high contrast of optical imaging with the deep penetration of ultrasound, is rapidly transitioning from preclinical research to clinical practice. However, its widespread clinical adoption faces challenges such as the inherent trade-off between penetration depth and spatial resolution, along with the demand for faster imaging speeds. This review comprehensively examines the fundamental principles of PAI, focusing on three primary implementations: photoacoustic computed tomography, photoacoustic microscopy, and photoacoustic endoscopy. It critically analyzes their respective advantages and limitations to provide insights into practical applications. The discussion then extends to recent advancements in image reconstruction and artifact suppression, where both conventional and deep learning (DL)-based approaches have been highlighted for their role in enhancing image quality and streamlining workflows. Furthermore, this work explores progress in quantitative PAI, particularly its ability to precisely measure hemoglobin concentration, oxygen saturation, and other physiological biomarkers. Finally, this review outlines emerging trends and future directions, underscoring the transformative potential of DL in shaping the clinical evolution of PAI.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"9 1","pages":"3"},"PeriodicalIF":6.0,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12860771/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146097413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-16 | DOI: 10.1186/s42492-025-00211-z
Md Motaleb Hossen Manik, William Muldowney, Md Zabirul Islam, Ge Wang
Computed tomography (CT) is a powerful imaging modality widely used in medicine, research, and industry for noninvasive visualization of internal structures. However, conventional CT systems rely on X-rays, which involve radiation exposure, high equipment costs, and complex regulatory requirements, making them unsuitable for educational or low-resource settings. To address these limitations, we developed a compact, low-cost, optically emulated CT scanner that uses visible light to image semi-transparent specimens. The system consists of a rotating stage enclosed within a light-isolated box, backlight illumination, and a fixed digital single-lens reflex camera. A Teensy 2.0 microcontroller regulates the rotation of the stage, while MATLAB is used to process the captured images using the inverse Radon transform and visualize the reconstructed volume using the Volumetric 3D MATLAB toolbox. Experimental results using a lemon slice demonstrate that the scanner can resolve internal features such as the peel, pulp, and seeds in both 2D and 3D renderings. This system offers a safe and affordable platform for demonstrating CT principles, with potential applications in education, industrial inspection, and visual computing.
{"title":"Development of an optically emulated computed tomography scanner for college education.","authors":"Md Motaleb Hossen Manik, William Muldowney, Md Zabirul Islam, Ge Wang","doi":"10.1186/s42492-025-00211-z","DOIUrl":"10.1186/s42492-025-00211-z","url":null,"abstract":"<p><p>Computed tomography (CT) is a powerful imaging modality widely used in medicine, research, and industry for noninvasive visualization of internal structures. However, conventional CT systems rely on X-rays, which involve radiation exposure, high equipment costs, and complex regulatory requirements, making them unsuitable for educational or low-resource settings. To address these limitations, we developed a compact, low-cost, optically emulated CT scanner that uses visible light to image semi-transparent specimens. The system consists of a rotating stage enclosed within a light-isolated box, backlight illumination, and a fixed digital single-lens reflex camera. A Teensy 2.0 microcontroller regulates the rotation of the stage, while MATLAB is used to process the captured images using the inverse Radon transform and visualize the reconstructed volume using the Volumetric 3D MATLAB toolbox. Experimental results using a lemon slice demonstrate that the scanner can resolve internal features such as the peel, pulp, and seeds in both 2D and 3D renderings. This system offers a safe and affordable platform for demonstrating CT principles, with potential applications in education, industrial inspection, and visual computing.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"9 1","pages":"2"},"PeriodicalIF":6.0,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12808011/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1186/s42492-025-00212-y
Xuanang Xu, Joshua Yan, Gloria Nwachukwu, Hongming Shan, Uwe Kruger, Ge Wang
Efficient and accurate assignment of journal submissions to suitable associate editors (AEs) is critical in maintaining review quality and timeliness, particularly in high-volume, rapidly evolving fields such as medical imaging. This study investigates the feasibility of leveraging large language models for AE-paper matching in IEEE Transactions on Medical Imaging. An AE database was curated from historical AE assignments and AE-authored publications, and six key textual components were extracted from each paper: the title, four categories of structured keywords, and the abstract. ModernBERT was employed locally to generate high-dimensional semantic embeddings, which were then reduced using principal component analysis (PCA) for efficient similarity computation. Keyword similarity, derived from structured domain-specific metadata, and textual similarity, derived from the ModernBERT embeddings, were combined to rank candidate AEs. Experiments on internal (historical assignments) and external (AE publications) test sets showed that keyword similarity is the dominant contributor to matching performance, while textual similarity offers complementary gains, particularly when PCA is applied. Ablation studies confirmed that structured keywords alone provide strong matching accuracy, with titles offering additional benefits and abstracts offering minimal improvements. The proposed approach offers a practical, interpretable, and scalable tool for editorial workflows, reduces manual workload, and supports high-quality peer review.
{"title":"Artificial intelligence-aided assignment of journal submissions to associate editors-a feasibility study on IEEE transactions on medical imaging.","authors":"Xuanang Xu, Joshua Yan, Gloria Nwachukwu, Hongming Shan, Uwe Kruger, Ge Wang","doi":"10.1186/s42492-025-00212-y","DOIUrl":"10.1186/s42492-025-00212-y","url":null,"abstract":"<p><p>Efficient and accurate assignment of journal submissions to suitable associate editors (AEs) is critical in maintaining review quality and timeliness, particularly in high-volume, rapidly evolving fields such as medical imaging. This study investigates the feasibility of leveraging large language models for AE-paper matching in IEEE Transactions on Medical Imaging. An AE database was curated from historical AE assignments and AE-authored publications, and extracted six key textual components from each paper title, four categories of structured keywords, and abstracts. ModernBERT was employed locally to generate high-dimensional semantic embeddings, which were then reduced using principal component analysis (PCA) for efficient similarity computation. Keyword similarity, derived from structured domain-specific metadata, and textual similarity from ModernBERT embeddings were combined to rank the candidate AEs. Experiments on internal (historical assignments) and external (AE Publications) test sets showed that keyword similarity is the dominant contributor to matching performance. Contrarily, textual similarity offers complementary gains, particularly when PCA is applied. Ablation studies confirmed that structured keywords alone provide strong matching accuracy, with titles offering additional benefits and abstracts offering minimal improvements. The proposed approach offers a practical, interpretable, and scalable tool for editorial workflows, reduces manual workload, and supports high-quality peer reviews.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"9 1","pages":"1"},"PeriodicalIF":6.0,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12791093/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145953273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-18 | DOI: 10.1186/s42492-025-00210-0
Zuan Gu, Tianhan Gao, Huimin Liu
Text-to-3D scene generation is pivotal for digital content creation; however, existing methods often struggle with global consistency across views. We present 3DS-Gen, a modular "generate-then-reconstruct" framework that first produces a temporally coherent multi-view video prior and then reconstructs consistent 3D scenes using sparse geometry estimation and Gaussian optimization. A cascaded variational autoencoder (2D for spatial compression and 3D for temporal compression) provides a compact and coherent latent sequence that facilitates robust reconstruction. An adaptive density threshold improves detail allocation in the Gaussian stage under a fixed computational budget. While explicit meshes can be extracted from the optimized representation when needed, our claims emphasize multi-view consistency and reconstructability; the mesh quality depends on the video prior and the chosen explicitification backend. 3DS-Gen runs on a single GPU and yields coherent scene reconstructions across diverse prompts, thereby providing a practical bridge between text and 3D content creation.
{"title":"Text-to-3D scene generation framework: bridging textual descriptions to high-fidelity 3D scenes.","authors":"Zuan Gu, Tianhan Gao, Huimin Liu","doi":"10.1186/s42492-025-00210-0","DOIUrl":"10.1186/s42492-025-00210-0","url":null,"abstract":"<p><p>Text-to-3D scene generation is pivotal for digital content creation; however, existing methods often struggle with global consistency across views. We present 3DS-Gen, a modular \"generate-then-reconstruct\" framework that first produces a temporally coherent multi-view video prior and then reconstructs consistent 3D scenes using sparse geometry estimation and Gaussian optimization. A cascaded variational autoencoder (2D for spatial compression and 3D for temporal compression) provides a compact and coherent latent sequence that facilitates robust reconstruction. An adaptive density threshold improves detailed allocation in the Gaussian stage under a fixed computational budget. While explicit meshes can be extracted from the optimized representation when needed, our claims emphasize multiview consistency and reconstructability; the mesh quality depends on the video prior and the chosen explicitification backend. 3DS-Gen runs on a single GPU and yields coherent scene reconstructions across diverse prompts, thereby providing a practical bridge between text and 3D content creation.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"29"},"PeriodicalIF":6.0,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12712286/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145775790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-12 | DOI: 10.1186/s42492-025-00209-7
Xiaofeng Zhang, Zijie Pan, Yuhang Tian, Lili Wang, Tingting Xu, Li Chen, Xiangyun Liao, Tianyu Jiang
Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and for leveraging digital health advancements; however, existing prediction models face limitations, relying primarily on ensemble classification techniques with suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection and temporal predictive modeling. For facial feature extraction, the study developed a lightweight face-region convolutional neural network (FRegNet) specialized in detecting key facial components of clinical patients, such as the eyes and lips. FRegNet incorporates a residual backbone (Rstem) to enhance feature representation and a facial path-aggregated feature pyramid network for multi-resolution feature fusion. Comparative experiments show that FRegNet outperforms state-of-the-art target detection algorithms, achieving an average precision (AP) of 0.922, average recall of 0.933, mean average precision (mAP) of 0.987, and precision of 0.98, significantly surpassing other mask region-based convolutional neural network (RCNN) variants such as mask RCNN-ResNeXt (AP of 0.789 and mAP of 0.957). Based on the extracted facial features and clinical physiological indicators, the study further proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance. Comparative results demonstrate that the ETED variant incorporating facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, precision of 0.850, recall of 0.895, F1 score of 0.884, and area under the curve (AUC) of 0.947, surpassing gradient boosting (which reaches an accuracy of 0.922 but an AUC of only 0.669) and other classifiers on comprehensive metrics. The results confirm that the multimodal dataset (facial features plus physiological indicators) significantly enhances the accuracy of predicting patients' seven-day survival. Correlation analysis reveals that chronic health evaluation and mean arterial pressure are positively correlated with survival, whereas temperature, Glasgow Coma Scale, and fibrinogen are negatively correlated.
{"title":"Enhanced temporal encoding-decoding for survival analysis of multimodal clinical data in smart healthcare.","authors":"Xiaofeng Zhang, Zijie Pan, Yuhang Tian, Lili Wang, Tingting Xu, Li Chen, Xiangyun Liao, Tianyu Jiang","doi":"10.1186/s42492-025-00209-7","DOIUrl":"10.1186/s42492-025-00209-7","url":null,"abstract":"<p><p>Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and leveraging digital health advancements; however, existing prediction models face limitations, primarily relying on ensemble classification techniques with suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection and temporal predictive modeling. For facial feature extraction, this study developed a lightweight face-region convolutional neural network (FRegNet) specialized in detecting key facial components, such as eyes and lips in clinical patients that incorporates a residual backbone (Rstem) to enhance feature representation and a facial path aggregated feature pyramid network for multi-resolution feature fusion; comparative experiments reveal that FRegNet outperforms state-of-the-art target detection algorithms, achieving average precision (AP) of 0.922, average recall of 0.933, mean average precision (mAP) of 0.987, and precision of 0.98-significantly surpassing other mask region-based convolutional neural networks (RCNN) variants, such as mask RCNN-ResNeXt with AP of 0.789 and mAP of 0.957. Based on the extracted facial features and clinical physiological indicators, this study proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance, with comparative results demonstrating that the ETED variant incorporating facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, precision of 0.850, recall of 0.895, F1 of 0.884, and area under the curve (AUC) of 0.947-outperforming gradient boosting with an accuracy of 0.922, but AUC of 0.669, and other classifiers in comprehensive metrics. The results confirm that the multimodal dataset (facial features + physiological indicators) significantly enhances the prediction accuracy of the seven-day survival conditions of patients. Correlation analysis reveals that chronic health evaluation and mean arterial pressure are positively correlated with survival, while temperature, Glasgow Coma Scale, and fibrinogen are negatively correlated.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"28"},"PeriodicalIF":6.0,"publicationDate":"2025-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12701133/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145744924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1186/s42492-025-00208-8
Khadija Slama, Ali Yahyaouy, Jamal Riffi, Mohamed Adnane Mahraz, Hamid Tairi
Epilepsy is a chronic neurological disorder characterized by recurrent seizures that can lead to death. Seizure treatment usually involves antiepileptic drugs and sometimes surgery, but patients with drug-resistant epilepsy often remain effectively untreated owing to the lack of targeted therapies. The development of a reliable technique for detecting and predicting epileptic seizures could significantly impact clinical treatment protocols and the care of patients with epilepsy. Over the years, researchers have developed various computational techniques using scalp electroencephalography (EEG), intracranial EEG, and other neuroimaging modalities, evolving from traditional signal processing methods (e.g., wavelet transforms and template matching) to advanced machine learning (ML, e.g., support vector machines and random forests) and deep learning (DL) algorithms (e.g., convolutional neural networks, recurrent neural networks, transformers, graph neural networks, and hybrid architectures). This review provides a detailed examination of epileptic seizure detection and prediction, covering the key aspects of signal processing, ML algorithms, and DL techniques applied to brainwave signals. We systematically categorized the techniques, analyzed key research trends, and identified critical challenges (e.g., data scarcity, model generalizability, and real-time processing). By highlighting the gaps in the literature, this review serves as a valuable resource for researchers and offers insights into future directions for improving the accuracy, interpretability, and clinical applicability of EEG-based seizure detection systems.
{"title":"Comprehensive review of machine learning and deep learning techniques for epileptic seizure detection and prediction based on neuroimaging modalities.","authors":"Khadija Slama, Ali Yahyaouy, Jamal Riffi, Mohamed Adnane Mahraz, Hamid Tairi","doi":"10.1186/s42492-025-00208-8","DOIUrl":"10.1186/s42492-025-00208-8","url":null,"abstract":"<p><p>Epilepsy is a chronic neurological disorder characterized by recurrent seizures that can lead to death. Seizure treatment usually involves antiepileptic drugs and sometimes surgery, but patients with drug-resistant epilepsy often remain effectively untreated owing to the lack of targeted therapies. The development of a reliable technique for detecting and predicting epileptic seizures could significantly impact clinical treatment protocols and the care of patients with epilepsy. Over the years, researchers have developed various computational techniques using scalp electroencephalography (EEG), intracranial EEG, and other neuroimaging modalities, evolving from traditional signal processing methods (e.g., wavelet transforms and template matching) to advanced machine learning (ML, e.g., support vector machines and random forests) and deep learning (DL) algorithms (e.g., convolutional neural networks, recurrent neural networks, transformers, graph neural networks, and hybrid architectures). This review provides a detailed examination of epileptic seizure detection and prediction, covering the key aspects of signal processing, ML algorithms, and DL techniques applied to brainwave signals. We systematically categorized the techniques, analyzed key research trends, and identified critical challenges (e.g., data scarcity, model generalizability, and real-time processing). By highlighting the gaps in the literature, this review serves as a valuable resource for researchers and offers insights into future directions for improving the accuracy, interpretability, and clinical applicability of EEG-based seizure detection systems.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"27"},"PeriodicalIF":6.0,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12696252/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145726481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-20 | DOI: 10.1186/s42492-025-00207-9
Rem RunGu Lin, Koo Yongen Ke, Kang Zhang
This study presents Body Cosmos 2.0, an embodied biofeedback system with an interactive interface situated at the intersection of dance, human-computer interaction, and bio-art. Building on the authors' prior work, "Body Cosmos: An Immersive Experience Driven by Real-time Bio-data," the system presents the concept of a 'bio-body', a dynamic digital embodiment of a dancer's internal state generated in real time through electroencephalography, heart rate sensors, motion tracking, and visualization techniques. Dancers interact with the system through three distinct experiences: "VR embodiment," which enables them to experience their internal states from a first-person perspective; "dancing within your bio-body," which immerses them in their internal physiological and emotional states; and "dancing with your bio-body," which creates a bio-digital reflection for expressive development and experiential exploration. To evaluate the system's effectiveness, a workshop was conducted with 24 experienced dancers to assess its impact on self-awareness, creativity, and dance expression. This integration of biodata with artistic expression transcends traditional neurofeedback and delves into the realm of embodied cognition. The study explores the concept, development, and application of "Body Cosmos 2.0," highlighting its potential to amplify self-awareness, augment performance, and expand the expressive and creative possibilities of dance.
{"title":"Body Cosmos 2.0: embodied biofeedback interface for dancing.","authors":"Rem RunGu Lin, Koo Yongen Ke, Kang Zhang","doi":"10.1186/s42492-025-00207-9","DOIUrl":"10.1186/s42492-025-00207-9","url":null,"abstract":"<p><p>This study presents Body Cosmos 2.0, an embodied biofeedback system with an interactive interface situated at the intersection of dance, human-computer interaction, and bio-art. Building on the authors' prior work, \"Body Cosmos: An Immersive Experience Driven by Real-time Bio-data,\" the system presents the concept of a 'bio-body'-a dynamic digital embodiment of a dancer's internal state-generated in real-time through electroencephalography, heart rate sensors, motion tracking, and visualization techniques. Dancers interact with the system through three distinct experiences \"VR embodiment,\" which enables them to experience their internal states from a first-person perspective; \"dancing within your bio-body,\" which immerses them in their internal physiological and emotional states; and \"dancing with your bio-body,\" which creates a bio-digital reflection for expressive development and experiential exploration. To evaluate the system's effectiveness, a workshop was conducted with 24 experienced dancers to assess its impact on self-awareness, creativity, and dance expressions. This integration of biodata with artistic expression transcends traditional neurofeedback and delves into the realm of embodied cognition. The study explores the concept, development, and application of \"Body Cosmos 2.0,\" highlighting its potential to amplify self-awareness, augment performance, and expand the expressive and creative possibilities of dance.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"26"},"PeriodicalIF":6.0,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12634995/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145565441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-23 | DOI: 10.1186/s42492-025-00206-w
Alexander Somerville, Keith Joiner, Timothy Lynar, Graham Wild
The use of extended reality (XR) spectrum technologies as substitutes for, or augmentations of, traditional simulators in pilot flight training has received significant interest in recent years. A systematic review was conducted to evaluate the efficacy of XR technologies for this purpose and to better understand the motivating factors for their use. The systematic review followed the QUOROM framework (adapted for educational studies), screening 1237 candidate articles down to 67 eligible for thematic analysis, 5 of which also met the meta-analysis criteria. The existing literature emphasizes the benefits of these technologies, particularly their immersiveness and spatial awareness, which enable the application of more modern educational theories. Although the existing literature spans much of the industry, it focuses particularly on general aviation and the more ab initio skills of flight. The results of the meta-analysis indicate improvements in pilot performance, with an overall meta-analytic effect size estimate of 0.884 (z = 2.248, P = 0.025), which is positive, statistically significant, and moderately strong. The findings of this review indicate support for the use of, and the intention to use, XR in pilot flight training simulators. However, multiple serious research gaps exist, such as the potentially higher occurrence of simulator sickness and cybersickness, and a lack of robust research trials examining the transfer of training across the full pilot skill set and curricular contexts. This novel systematic review and meta-analysis represents a significant attempt to shape and direct better research that can guide flourishing technological XR development at a time of increasing pilot shortages and aviation growth.
{"title":"Applications of extended reality in pilot flight simulator training: a systematic review with meta-analysis.","authors":"Alexander Somerville, Keith Joiner, Timothy Lynar, Graham Wild","doi":"10.1186/s42492-025-00206-w","DOIUrl":"10.1186/s42492-025-00206-w","url":null,"abstract":"<p><p>The use of extended reality (XR) spectrum technologies as substitutes to augment traditional simulators in pilot flight training has received significant interest in recent times. A systematic review was conducted to evaluate the efficacy of XR technologies for this purpose and better understand the motivating factors for this use. The systematic review followed the QUOROM framework (adapted for educational studies), screening 1237 candidate articles to 67 eligible for thematic analysis, with 5 of these also meeting meta-analysis criteria. Existing literature emphasizes the benefits of these technologies, particularly as a result of immersiveness and spatial awareness, enabling the application of more modern educational theories. Although the existing literature is concerned with much of the industry, there is a specific focus on general aviation and the more ab initio skills of flight. The results of the meta-analysis indicate improvements in pilot performance, with an overall meta-analytic effect size estimate of 0.884 (z = 2.248, P = 0.025), which is positive, statistically significant, and moderately strong. The findings of this review indicate support for the use and intention for the use of XR in pilot flight training simulators. However, multiple serious research gaps exist, such as the potential higher occurrence of simulator sickness and cybersickness, and a lack of robust research trials that examine transfer of training across the full pilot skill set and curricular contexts. This novel systematic review and meta-analysis represent a significant attempt to shape and direct better research to help to direct flourishing technological XR development in a time of increasing pilot shortages and aviation growth.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"25"},"PeriodicalIF":6.0,"publicationDate":"2025-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12546163/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145348911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-02 | DOI: 10.1186/s42492-025-00205-x
Yiping Wu, Yue Li, Eugene Ch'ng, Jiaxin Gao, Tao Hong
Gesture-based interactions in a virtual reality (VR) setting can enhance our experience of traditional practices as part of preserving and communicating heritage. Cultural experiences embodied within VR environments are suggested to be an effective approach for experiencing intangible cultural heritage. Ceremonies, rituals, and related ancestral enactments are important for preserving cultural heritage. Kāi Bǐ Lǐ, also known as the First Writing Ceremony, is traditionally held for Chinese children before their first year of elementary school. However, gesture-based immersive VR for learning this tradition is new and has not yet been developed within the community. This study focused on how users experienced learning cultural practices using gesture-based interactive VR across different age groups and hardware platforms. We first conducted an experiment with 60 participants (30 young adults and 30 children) using the First Writing Ceremony as a case study in which gestural interactions were elicited, designed, implemented, and evaluated. The study showed significant differences in play time and presence between head-mounted display VR and desktop VR. In addition, children were less likely to experience fatigue than young adults. We then conducted a follow-up study eight months later to investigate the VR systems' long-term learning effectiveness, which showed that children outperformed young adults, demonstrating greater knowledge retention. Our results and findings contribute to the design of gesture-based VR for different age groups across different platforms for experiencing, learning, and practicing cultural activities.
{"title":"KaiBiLi: gesture-based immersive virtual reality ceremony for traditional Chinese cultural activities.","authors":"Yiping Wu, Yue Li, Eugene Ch'ng, Jiaxin Gao, Tao Hong","doi":"10.1186/s42492-025-00205-x","DOIUrl":"10.1186/s42492-025-00205-x","url":null,"abstract":"<p><p>Gesture-based interactions in a virtual reality (VR) setting can enhance our experience of traditional practices as part of preserving and communicating heritage. Cultural experiences embodied within VR environments are suggested to be an effective approach for experiencing intangible cultural heritage. Ceremonies, rituals, and related ancestral enactments are important for preserving cultural heritage. Kāi Bǐ Lǐ, also known as the First Writing Ceremony, is traditionally held for Chinese children before their first year of elementary school. However, gesture-based immersive VR for learning this tradition is new, and have not been developed within the community. This study focused on how users experienced learning cultural practices using gesture-based interactive VR across different age groups and hardware platforms. We first conducted an experiment with 60 participants (30 young adults and 30 children) using the First Writing Ceremony as a case study in which gestural interactions were elicited, designed, implemented, and evaluated. The study showed significant differences in play time and presence between the head-mounted display VR and desktop VR. In addition, children were less likely to experience fatigue than young adults. Following this, we conducted another study after eight months to investigate the VR systems' long-term learning effectiveness. This showed that children outperformed young adults in demonstrating greater knowledge retention. Our results and findings contribute to the design of gesture-based VR for different age groups across different platforms for experiencing, learning, and practicing cultural activities.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"24"},"PeriodicalIF":6.0,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12491145/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145207799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}