A low-channel EEG-to-speech conversion approach for assisting people with communication disorders
Pub Date: 2025-03-26 | DOI: 10.1016/j.smhl.2025.100568
Kunning Shen, Huining Li
Brain–Computer Interface (BCI) technology has emerged as a promising solution for individuals with communication disorders. However, current electroencephalography (EEG)-to-speech systems typically require high-channel EEG equipment (64+ channels), limiting their accessibility in resource-constrained environments. This paper presents a novel low-channel EEG-to-speech framework that operates effectively with only 6 EEG channels. By leveraging a generator–discriminator architecture for speech reconstruction, our system achieves a Character Error Rate (CER) of 64.24%, outperforming baseline systems that use 64 channels (68.26% CER). We further integrate Undercomplete Independent Component Analysis (UICA) for channel reduction, maintaining comparable accuracy (64.99% CER) while cutting the input from 6 channels to 4 and reducing computational complexity accordingly. These results demonstrate the feasibility of efficient speech reconstruction from minimal EEG inputs, potentially enabling wider deployment of BCI technology in resource-limited healthcare settings.
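As a rough illustration of the UICA channel-reduction step, the sketch below applies undercomplete ICA by requesting fewer independent components than input channels, using scikit-learn's FastICA. The 6-to-4 channel counts mirror the abstract; the data, shapes, and variable names are purely illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Placeholder EEG: 10,000 samples x 6 channels (illustrative shapes only).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((10_000, 6))

# Undercomplete ICA: fewer components (4) than channels (6), so the
# unmixing step doubles as channel/dimensionality reduction.
uica = FastICA(n_components=4, whiten="unit-variance", random_state=0)
sources = uica.fit_transform(eeg)  # shape: (10_000, 4)

# 'sources' would then feed the generator of the speech-reconstruction model.
print(sources.shape)
```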
{"title":"A low-channel EEG-to-speech conversion approach for assisting people with communication disorders","authors":"Kunning Shen , Huining Li","doi":"10.1016/j.smhl.2025.100568","DOIUrl":"10.1016/j.smhl.2025.100568","url":null,"abstract":"<div><div>Brain–Computer Interface (BCI) technology has emerged as a promising solution for individuals with communication disorders. However, current electroencephalography (EEG) to speech systems typically require high-channel EEG equipment (64+ channels), limiting their accessibility in resource-constrained environments. This paper implements a novel low-channel EEG-to-speech framework that effectively operates with only 6 EEG channels. By leveraging a generator-discriminator architecture for speech reconstruction, our system achieves a Character Error Rate (CER) of 64.24%, outperforming baseline systems that utilize 64 channels (68.26% CER). We further integrate Undercomplete Independent Component Analysis (UICA) for channel reduction, maintaining comparable accuracy (64.99% CER) while reducing computational complexity from 6 channels to 4 channels. This breakthrough demonstrates the feasibility of efficient speech reconstruction from minimal EEG inputs, potentially enabling more widespread deployment of BCI technology in resource-limited healthcare settings.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100568"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background parenchymal uptake classification using deep transfer learning on digital mammograms
Pub Date: 2025-03-26 | DOI: 10.1016/j.smhl.2025.100573
Xudong Liu, Christopher Scott, Imon Banerjee, Celine Vachon, Carrie Hruska
Background parenchymal uptake (BPU) in fibroglandular tissue on a molecular breast image (MBI) has been shown to be a strong risk factor for breast cancer and complementary to mammographic density. However, MBI is generally performed on women with dense breasts and is only available at institutions with nuclear medicine capabilities, limiting the utility of this measure in routine breast screening and risk assessment. Digital mammography, by contrast, is used for routine breast screening. Our goal was to evaluate whether BPU features could be identified from digital mammograms (DMs) using deep transfer learning. Specifically, we identified a cohort of about 2000 women from a breast screening center who had DM and MBI performed during the same time period, and we trained models on DMs to classify BPU categories. We consider two classification problems in this work: a five-category classification of BPU and a binary classification over two combined classes. We designed and implemented machine learning algorithms leveraging state-of-the-art pre-trained deep neural networks, evaluated them on the collected data using metrics such as accuracy, F1-score, and AUROC, and provided visual explanations using saliency mapping and gradient-weighted class activation mapping (GradCAM). Our results show that, among the evaluated models, WideResNet-50 performs best on a hold-out test set for the five-category classification, with 58% accuracy, 0.82 micro-average AUROC, and 0.72 macro-average AUROC, while ResNet-18 comes out on top for the binary categorization, with 77% accuracy, 0.86 AUROC, and 0.77 F1-score. We also found that incorporating age, body mass index (BMI), and menopausal status improved classification of BPU compared to DM alone.
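The transfer-learning recipe the abstract describes can be sketched as follows: take an ImageNet-pretrained WideResNet-50 from torchvision and swap its head for the five BPU categories. The backbone choice and the five-way head match the abstract; the frozen-backbone policy, optimizer, and input sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained WideResNet-50 backbone.
model = models.wide_resnet50_2(weights=models.Wide_ResNet50_2_Weights.IMAGENET1K_V1)

# One common policy: freeze the backbone and fine-tune only the new head.
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-way ImageNet head with a 5-way BPU classifier.
model.fc = nn.Linear(model.fc.in_features, 5)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of mammogram crops.
images = torch.randn(8, 3, 224, 224)   # placeholder batch
labels = torch.randint(0, 5, (8,))     # placeholder BPU categories
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```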
{"title":"Background parenchymal uptake classification using deep transfer learning on digital mammograms","authors":"Xudong Liu , Christopher Scott , Imon Banerjee , Celine Vachon , Carrie Hruska","doi":"10.1016/j.smhl.2025.100573","DOIUrl":"10.1016/j.smhl.2025.100573","url":null,"abstract":"<div><div>Background parenchymal uptake (BPU) in fibroglandular tissue on a molecular breast image (MBI) has been shown to be a strong risk factor for breast cancer and complementary to mammographic density. However, MBI is generally performed on women with dense breasts and only available at institutions with nuclear medicine capabilities, limiting the utility of this measure in routine breast screening and risk assessment. Digital mammography is used for routine breast screening. Our goal was to evaluate whether BPU features could be identified from digital mammograms (DMs) using deep transfer learning. Specifically, we identified a cohort of about 2000 women from a breast screening center who had DM and MBI performed at the same time period and trained models on DMs to classify BPU categories. We consider two types of classification problems in this work: a five-category classification of BPU and two combined classes. We designed and implemented machine learning algorithms leveraging state-of-the-art pre-trained deep neural networks, evaluated these algorithms on the collected data based using metrics such as accuracy, F1-score, and AUROC, and provided visual explanations using saliency mapping and gradient-weighted class activation mapping (GradCAM). Our results show that, among the experimented models, WideResNet-50 demonstrates the best performance on a hold-out test set with 58% accuracy, 0.82 micro-average AUROC and 0.72 macro-average AUROC on the five-category classification, while ResNet-18 comes out on top with 77% accuracy, 0.86 AUROC and 0.77 F1-score on the binary categorization. We also found that incorporating age, body mass index (BMI) and menopausal status improved classification of BPU compared to DM alone.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100573"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143747323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile app-based study of driving behaviors under the influence of cannabis
Pub Date: 2025-03-26 | DOI: 10.1016/j.smhl.2025.100558
Honglu Li, Bin Han, Cong Shi, Yan Wang, Tammy Chung, Yingying Chen
Cannabis use has become increasingly prevalent due to evolving legal and societal attitudes, raising concerns about its influence on public safety, particularly in driving. Existing studies mostly rely on simulators or specialized equipment, which do not capture the complexities of real-world driving and pose cost and scalability issues. In this paper, we investigate the effects of cannabis on driving behavior using participants’ smartphones to gather data in natural settings. Our method focuses on three critical behaviors: weaving & swerving, wide turning, and hard braking. We propose a two-step segmentation algorithm for processing continuous motion sensor data and use threshold-based methods for efficient detection. A custom application autonomously records driving events during actual road scenarios. On-road experiments with 9 participants who consumed cannabis under controlled conditions reveal a correlation between cannabis use and altered driving behaviors, with significant effects emerging approximately 2-3 h after consumption.
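A minimal sketch of the threshold-based detection idea, for one of the three behaviors (hard braking) from smartphone accelerometer data; the sampling rate, threshold, and minimum duration are illustrative assumptions, not the study's parameters.

```python
import numpy as np

FS = 50                 # assumed sampling rate (Hz)
BRAKE_THRESHOLD = -3.0  # assumed longitudinal deceleration threshold (m/s^2)
MIN_DURATION = 0.5      # assumed minimum event duration (s)

def detect_hard_braking(longitudinal_accel: np.ndarray) -> list[tuple[int, int]]:
    """Return (start, end) sample indices of segments where deceleration
    stays below the threshold for at least MIN_DURATION seconds."""
    below = longitudinal_accel < BRAKE_THRESHOLD
    events, start = [], None
    for i, flag in enumerate(below):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) / FS >= MIN_DURATION:
                events.append((start, i))
            start = None
    if start is not None and (len(below) - start) / FS >= MIN_DURATION:
        events.append((start, len(below)))
    return events

# Synthetic 10-second trace with one 0.8 s braking episode.
accel = np.zeros(500)
accel[200:240] = -4.0
print(detect_hard_braking(accel))  # -> [(200, 240)]
```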
{"title":"Mobile app-based study of driving behaviors under the influence of cannabis","authors":"Honglu Li , Bin Han , Cong Shi , Yan Wang , Tammy Chung , Yingying Chen","doi":"10.1016/j.smhl.2025.100558","DOIUrl":"10.1016/j.smhl.2025.100558","url":null,"abstract":"<div><div>Cannabis use has become increasingly prevalent due to evolving legal and societal attitudes, raising concerns about its influence on public safety, particularly in driving. Existing studies mostly rely on simulators or specialized equipment, which do not capture the complexities of real-world driving and pose cost and scalability issues. In this paper, we investigate the effects of cannabis on driving behavior using participants’ smartphones to gather data in natural settings. Our method focuses on three critical behaviors: weaving & swerving, wide turning, and hard braking. We propose a two-step segmentation algorithm for processing continuous motion sensor data and use threshold-based methods for efficient detection. A custom application autonomously records driving events during actual road scenarios. On-road experiments with 9 participants who consumed cannabis under controlled conditions reveal a correlation between cannabis use and altered driving behaviors, with significant effects emerging approximately 2<span><math><mo>∼</mo></math></span>3 h after consumption.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100558"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SNOMED CT ontology multi-relation classification by using knowledge embedding in neural network
Pub Date: 2025-03-26 | DOI: 10.1016/j.smhl.2025.100560
Bofan He, Jerry Q. Cheng, Huanying Gu
SNOMED CT is a widely recognized healthcare terminology designed to comprehensively represent clinical knowledge. Identifying missing or incorrect relationships between medical concepts is crucial for enhancing the scope and quality of this ontology, thereby improving healthcare analytics and decision support. In this study, we propose a novel multi-link prediction approach that utilizes knowledge graph embeddings and neural networks to infer missing relationships within the SNOMED CT knowledge graph. Using TransE, we train embeddings for triples (concept, relation, concept) and develop a multi-head classifier to predict relationship types based solely on concept pairs. With an embedding dimension of 200, a batch size of 128, and 10 epochs, we achieve the highest test accuracy of 91.96% on the relationship prediction task. This study demonstrates an optimal balance between efficiency, generalization, and representational capacity. By expanding on existing methodologies, this work offers insights into practical applications for ontology enrichment and contributes to the ongoing advancement of predictive models in healthcare informatics. Furthermore, it highlights the potential scalability of the approach, providing a framework that can be extended to other knowledge graphs and domains.
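The two components the abstract names can be sketched compactly: the TransE score treats a relation as a translation so that head + relation ≈ tail, and a classifier predicts the relation type from a concept pair's embeddings alone. Only the 200-dimensional embedding matches the abstract; the vocabulary sizes and the single MLP head (standing in for the paper's multi-head classifier) are illustrative assumptions.

```python
import torch
import torch.nn as nn

EMB_DIM = 200           # matches the abstract
NUM_CONCEPTS = 100_000  # illustrative vocabulary size
NUM_RELATIONS = 50      # illustrative number of SNOMED CT relation types

concept_emb = nn.Embedding(NUM_CONCEPTS, EMB_DIM)
relation_emb = nn.Embedding(NUM_RELATIONS, EMB_DIM)

def transe_score(h, r, t):
    """TransE energy ||h + r - t||; lower means a more plausible triple."""
    return (concept_emb(h) + relation_emb(r) - concept_emb(t)).norm(p=2, dim=-1)

class RelationClassifier(nn.Module):
    """Predict the relation type from a (head, tail) concept pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * EMB_DIM, 256), nn.ReLU(),
            nn.Linear(256, NUM_RELATIONS),
        )

    def forward(self, h, t):
        pair = torch.cat([concept_emb(h), concept_emb(t)], dim=-1)
        return self.net(pair)  # logits over relation types

clf = RelationClassifier()
h, t = torch.tensor([1, 2]), torch.tensor([3, 4])
print(clf(h, t).shape)  # torch.Size([2, 50])
```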
{"title":"SNOMED CT ontology multi-relation classification by using knowledge embedding in neural network","authors":"Bofan He, Jerry Q. Cheng, Huanying Gu","doi":"10.1016/j.smhl.2025.100560","DOIUrl":"10.1016/j.smhl.2025.100560","url":null,"abstract":"<div><div>SNOMED CT is a widely recognized healthcare terminology designed to comprehensively represent clinical knowledge. Identifying missing or incorrect relationships between medical concepts is crucial for enhancing the scope and quality of this ontology, thereby improving healthcare analytics and decision support. In this study, we propose a novel multi-link prediction approach that utilizes knowledge graph embeddings and neural networks to infer missing relationships within the SNOMED CT knowledge graph. By utilizing TransE, we train embeddings for triples (concept, relation, concept) and develop a multi-head classifier to predict relationship types based solely on concept pairs. With an embedding dimension of 200, a batch size of 128, and 10 epochs, we achieved the highest test accuracy of 91.96% in relationships prediction tasks. This study demonstrates an optimal balance between efficiency, generalization, and representational capacity. By expanding on existing methodologies, this work offers insights into practical applications for ontology enrichment and contributes to the ongoing advancement of predictive models in healthcare informatics. Furthermore, it highlights the potential scalability of the approach, providing a framework that can be extended to other knowledge graphs and domains.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100560"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143760190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring finetuned audio-LLM on heart murmur features
Pub Date: 2025-03-26 | DOI: 10.1016/j.smhl.2025.100557
Adrian Florea, Xilin Jiang, Nima Mesgarani, Xiaofan Jiang
Large language models (LLMs) for audio have excelled in recognizing and analyzing human speech, music, and environmental sounds. However, their potential for understanding other types of sounds, particularly biomedical sounds, remains largely underexplored despite significant scientific interest. In this study, we focus on diagnosing cardiovascular diseases using phonocardiograms, i.e., heart sounds. Most existing deep neural network (DNN) paradigms are restricted to heart murmur classification (healthy vs. unhealthy) and do not predict other acoustic features of the murmur, such as grading, harshness, pitch, and quality, which are important in helping physicians diagnose the underlying heart conditions. We propose to finetune an audio LLM, Qwen2-Audio, on the PhysioNet CirCor DigiScope phonocardiogram (PCG) dataset and evaluate its performance in classifying 11 expert-labeled features. Additionally, we aim to achieve a more noise-robust and generalizable system by exploring a preprocessing segmentation algorithm based on an audio representation model, SSAMBA. Our results indicate that the LLM-based model outperforms state-of-the-art methods on 10 of the 11 tasks. Moreover, the LLM successfully classifies long-tail features with limited training data, a task at which all previous methods have failed. These findings underscore the potential of audio LLMs as assistants to human cardiologists in enhancing heart disease diagnosis.
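One way to picture the finetuning setup is as (audio, question, answer) records, one per expert-labeled murmur feature of a recording. The sketch below is a hypothetical data layout, not the paper's: the feature names beyond the abstract's examples (grading, harshness, pitch, quality) and the record format are assumptions.

```python
# Hypothetical finetuning records: one (audio, prompt, answer) example per
# expert-labeled murmur feature of a phonocardiogram recording.
FEATURES = ["murmur_present", "grading", "harshness", "pitch", "quality"]  # subset of the 11

def build_examples(wav_path: str, labels: dict[str, str]) -> list[dict]:
    examples = []
    for feature in FEATURES:
        if feature not in labels:
            continue  # long-tail features may be unlabeled for some recordings
        examples.append({
            "audio": wav_path,
            "prompt": f"Listen to this phonocardiogram. What is the murmur's {feature}?",
            "answer": labels[feature],
        })
    return examples

records = build_examples(
    "pcg_0001.wav",
    {"murmur_present": "present", "grading": "II/VI", "pitch": "low"},
)
for r in records:
    print(r["prompt"], "->", r["answer"])
```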
{"title":"Exploring finetuned audio-LLM on heart murmur features","authors":"Adrian Florea, Xilin Jiang, Nima Mesgarani, Xiaofan Jiang","doi":"10.1016/j.smhl.2025.100557","DOIUrl":"10.1016/j.smhl.2025.100557","url":null,"abstract":"<div><div>Large language models (LLMs) for audio have excelled in recognizing and analyzing human speech, music, and environmental sounds. However, their potential for understanding other types of sounds, particularly biomedical sounds, remains largely underexplored despite significant scientific interest. In this study, we focus on diagnosing cardiovascular diseases using phonocardiograms, i.e., heart sounds. Most existing deep neural network (DNN) paradigms are restricted to heart murmur classification (healthy vs unhealthy) and do not predict other acoustic features of the murmur such as grading, harshness, pitch, and quality, which are important in helping physicians diagnose the underlying heart conditions. We propose to finetune an audio LLM, Qwen2-Audio, on the PhysioNet CirCor DigiScope phonocardiogram (PCG) dataset and evaluate its performance in classifying 11 expert-labeled features. Additionally, we aim to achieve more noise-robust and generalizable system by exploring a preprocessing segmentation algorithm using an audio representation model, SSAMBA. Our results indicate that the LLM-based model outperforms state-of-the-art methods in 10 of the 11 tasks. Moreover, the LLM successfully classifies long-tail features with limited training data, a task that all previous methods have failed to classify. These findings underscore the potential of audio LLMs as assistants to human cardiologists in enhancing heart disease diagnosis.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100557"},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143714430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An adaptive multimodal fusion framework for smartphone-based medication adherence monitoring of Parkinson’s disease
Pub Date: 2025-03-25 | DOI: 10.1016/j.smhl.2025.100561
Chongxin Zhong, Jinyuan Jia, Huining Li
Ensuring medication adherence for Parkinson’s disease (PD) patients is crucial for relieving patients’ symptoms and customizing regimens according to patients’ clinical responses. However, traditional self-management approaches are often error-prone and have limited effectiveness in improving adherence. While smartphone-based solutions have been introduced to monitor various PD metrics, including medication adherence, these methods often rely on single-modality data or fail to fully leverage the advantages of multimodal integration. To address these issues, we present an adaptive multimodal fusion framework for smartphone-based monitoring of medication adherence in PD. Specifically, we segment raw sensor data and transform it into spectrograms. Then, we integrate the multimodal data while quantifying the quality of each modality, and perform gradient modulation based on each modality’s contribution. Afterward, we monitor medication adherence in PD patients by detecting their medicine intake status. We evaluate the performance on a dataset of daily-life scenarios involving 455 patients. The results show that our approach achieves around 94% accuracy in medication adherence monitoring, indicating that the proposed framework is a promising tool for facilitating medication adherence monitoring in PD patients’ daily lives.
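A minimal sketch of quality-aware fusion for two assumed smartphone modalities (accelerometer and audio spectrograms): a learned quality score gates each branch's contribution, so a low-quality modality also receives smaller gradients, a crude stand-in for the paper's gradient modulation. All shapes, the two-modality choice, and the architecture are assumptions.

```python
import torch
import torch.nn as nn

class QualityWeightedFusion(nn.Module):
    """Fuse two modality encoders, weighting each by a learned quality score."""
    def __init__(self, dim=64, num_classes=2):
        super().__init__()
        self.enc_accel = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.enc_audio = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.quality = nn.Linear(dim, 1)  # scores a modality embedding
        self.head = nn.Linear(dim, num_classes)

    def forward(self, accel_spec, audio_spec):
        za = self.enc_accel(accel_spec)
        zb = self.enc_audio(audio_spec)
        # Softmax over the two quality scores -> fusion weights; the
        # lower-quality branch contributes (and learns) less.
        w = torch.softmax(torch.cat([self.quality(za), self.quality(zb)], dim=1), dim=1)
        fused = w[:, :1] * za + w[:, 1:] * zb
        return self.head(fused)  # logits: medicine intake vs. not

model = QualityWeightedFusion()
logits = model(torch.randn(4, 1, 32, 32), torch.randn(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 2])
```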
{"title":"An adaptive multimodal fusion framework for smartphone-based medication adherence monitoring of Parkinson’s disease","authors":"Chongxin Zhong , Jinyuan Jia , Huining Li","doi":"10.1016/j.smhl.2025.100561","DOIUrl":"10.1016/j.smhl.2025.100561","url":null,"abstract":"<div><div>Ensuring medication adherence for Parkinson’s disease (PD) patients is crucial to relieve patients’ symptoms and better customizing regimens according to patient’s clinical responses. However, traditional self-management approaches are often error-prone and have limited effectiveness in improving adherence. While smartphone-based solutions have been introduced to monitor various PD metrics, including medication adherence, these methods often rely on single-modality data or fail to fully leverage the advantages of multimodal integration. To address the issues, we present an adaptive multimodal fusion framework for monitoring medication adherence of PD based on a smartphone. Specifically, we segment and transform raw data from sensors to spectrograms. Then, we integrate multimodal data with quantification of their qualities and perform gradient modulation based on the contribution of each modality. Afterward, we monitor medication adherence in PD patients by detecting their medicine intake status. We evaluate the performance with the dataset from daily-life scenarios involving 455 patients. The results show that our work can achieve around 94% accuracy in medication adherence monitoring, indicating that our proposed framework is a promising tool to facilitate medication adherence monitoring in PD patients’ daily lives.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100561"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143704184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transforming stroop task cognitive assessments with multimodal inverse reinforcement learning
Pub Date: 2025-03-25 | DOI: 10.1016/j.smhl.2025.100567
Ali Abbasi, Jiaqi Gong, Soroush Korivand
Stroop tasks, recognized for their cognitively demanding nature, hold promise for diagnosing and monitoring neurodegenerative diseases. Understanding how humans allocate attention and resolve interference in the Stroop test remains a challenge, yet addressing this gap could reveal key opportunities for early-stage detection. Traditional approaches overlook the interplay between overt behavior and underlying neural processes, limiting insights into the complex color-word associations at play. To tackle this, we propose a framework that applies Inverse Reinforcement Learning (IRL) to fuse electroencephalography (EEG) signals with eye-tracking data, bridging the gap between neural and behavioral markers of cognition. We designed a Stroop experiment featuring congruent and incongruent conditions to evaluate attention allocation under varying levels of interference. By framing gaze as actions guided by an internally derived reward, IRL uncovers hidden motivations behind scanning patterns, while EEG data, processed with advanced feature extraction, reveals task-specific neural dynamics under high conflict. We validate our approach by measuring Probability Mismatch, Target Fixation Probability-Area Under the Curve, Sequence Score, and MultiMatch metrics. Results show that the IRL-EEG model outperforms an IRL-Image baseline, demonstrating improved alignment with human scanpaths and heightened sensitivity to attentional shifts in incongruent trials. These findings highlight the value of integrating neural data into computational models of cognition and illuminate possibilities for early detection of neurodegenerative disorders, where subclinical deficits may first emerge. Our IRL-based integration of EEG and eye-tracking further supports personalized cognitive assessments and adaptive user interfaces.
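A toy rendering of the core IRL idea, collapsed to a single step: learn reward weights over state features (standing in for fused gaze-and-EEG descriptors) by matching the expert's feature expectations, maximum-entropy style. The paper's actual model is a deep, sequential IRL framework; the state space, features, and expert distribution below are entirely illustrative.

```python
import numpy as np

# Toy setup: 5 discrete gaze states (word region, color patch, distractors, ...),
# each described by a feature vector standing in for gaze+EEG descriptors.
rng = np.random.default_rng(1)
features = rng.standard_normal((5, 3))               # 5 states x 3 features
expert_visits = np.array([0.4, 0.3, 0.1, 0.1, 0.1])  # expert's state distribution

w = np.zeros(3)
for _ in range(500):
    # Max-entropy model: visitation proportional to exp(reward).
    logits = features @ w
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()
    # Gradient = expert feature expectation - model feature expectation.
    grad = expert_visits @ features - policy @ features
    w += 0.1 * grad

print("learned reward weights:", w)
print("model visitation:", np.round(policy, 3))
```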
{"title":"Transforming stroop task cognitive assessments with multimodal inverse reinforcement learning","authors":"Ali Abbasi , Jiaqi Gong , Soroush Korivand","doi":"10.1016/j.smhl.2025.100567","DOIUrl":"10.1016/j.smhl.2025.100567","url":null,"abstract":"<div><div>Stroop tasks, recognized for their cognitively demanding nature, hold promise for diagnosing and monitoring neurodegenerative diseases. Understanding how humans allocate attention and resolve interference in the Stroop test remains a challenge; yet addressing this gap could reveal key opportunities for early-stage detection. Traditional approaches overlook the interplay between overt behavior and underlying neural processes, limiting insights into the complex color-word associations at play. To tackle this, we propose a framework that applies Inverse Reinforcement Learning (IRL) to fuse electroencephalography (EEG) signals with eye-tracking data, bridging the gap between neural and behavioral markers of cognition. We designed a Stroop experiment featuring congruent and incongruent conditions to evaluate attention allocation under varying levels of interference. By framing gaze as actions guided by an internally derived reward, IRL uncovers hidden motivations behind scanning patterns, while EEG data — processed with advanced feature extraction — reveals task-specific neural dynamics under high conflict. We validate our approach by measuring Probability Mismatch, Target Fixation Probability-Area Under the Curve, Sequence Score, and MultiMatch metrics. Results show that the IRL-EEG model outperforms an IRL-Image baseline, demonstrating improved alignment with human scanpaths and heightened sensitivity to attentional shifts in incongruent trials. These findings highlight the value of integrating neural data into computational models of cognition and illuminate possibilities for early detection of neurodegenerative disorders, where subclinical deficits may first emerge. Our IRL-based integration of EEG and eye-tracking further supports personalized cognitive assessments and adaptive user interfaces.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100567"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving gastric lesion detection with synthetic images from diffusion models
Pub Date: 2025-03-25 | DOI: 10.1016/j.smhl.2025.100569
Yanhua Si, Yingyun Yang, Qilei Chen, Zinan Xiong, Yu Cao, Xinwen Fu, Benyuan Liu, Aiming Yang
In the application of deep learning to gastric cancer detection, the quality of the dataset is as important as, if not more important than, the design of the network architecture. However, obtaining labeled data, especially in fields such as medical imaging for gastric cancer detection, can be expensive and challenging. This scarcity is exacerbated by stringent privacy regulations and the need for annotations by specialists. Conventional data augmentation methods fall short due to the complexities of medical imagery. In this paper, we explore the use of diffusion models to generate synthetic medical images for gastric cancer detection. We evaluate their capability to produce realistic images that can augment small datasets, potentially enhancing the accuracy and robustness of detection algorithms. By training diffusion models on existing gastric cancer data and producing new images, we aim to expand these datasets, thereby improving the training of deep learning models to achieve better precision and generalization in lesion detection. Our findings indicate that images generated by diffusion models significantly mitigate the issue of data scarcity, advancing the field of deep learning in medical imaging.
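A minimal sketch of the augmentation loop, assuming a DDPM already trained on gastric endoscopy images and the Hugging Face diffusers API; the checkpoint path and sample counts are placeholders.

```python
from pathlib import Path
from diffusers import DDPMPipeline

# Hypothetical checkpoint: a DDPM trained on the gastric-lesion training split.
pipe = DDPMPipeline.from_pretrained("path/to/gastric-ddpm")

out_dir = Path("synthetic_gastric")
out_dir.mkdir(exist_ok=True)

# Sample synthetic images in small batches and save them to be mixed into
# the lesion detector's training set alongside the real images.
num_synthetic, batch = 64, 8
for i in range(num_synthetic // batch):
    images = pipe(batch_size=batch).images  # list of PIL images
    for j, img in enumerate(images):
        img.save(out_dir / f"synthetic_{i * batch + j:04d}.png")
```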
{"title":"Improving gastric lesion detection with synthetic images from diffusion models","authors":"Yanhua Si , Yingyun Yang , Qilei Chen , Zinan Xiong , Yu Cao , Xinwen Fu , Benyuan Liu , Aiming Yang","doi":"10.1016/j.smhl.2025.100569","DOIUrl":"10.1016/j.smhl.2025.100569","url":null,"abstract":"<div><div>In the application of deep learning for gastric cancer detection, the quality of the data set is as important as, if not more, the design of the network architecture. However, obtaining labeled data, especially in fields such as medical imaging to detect gastric cancer, can be expensive and challenging. This scarcity is exacerbated by stringent privacy regulations and the need for annotations by specialists. Conventional methods of data augmentation fall short due to the complexities of medical imagery. In this paper, we explore the use of diffusion models to generate synthetic medical images for the detection of gastric cancer. We evaluate their capability to produce realistic images that can augment small datasets, potentially enhancing the accuracy and robustness of detection algorithms. By training diffusion models on existing gastric cancer data and producing new images, our aim is to expand these datasets, thereby enhancing the efficiency of deep learning model training to achieve better precision and generalization in lesion detection. Our findings indicate that images generated by diffusion models significantly mitigate the issue of data scarcity, advancing the field of deep learning in medical imaging.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100569"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous prediction of user dropout in a mobile mental health intervention program: An exploratory machine learning approach
Pub Date: 2025-03-25 | DOI: 10.1016/j.smhl.2025.100565
Pinxiang Wang, Hanqi Chen, Zhouyu Li, Wenyao Xu, Yu-Ping Chang, Huining Li
Mental health interventions can help relieve symptoms such as anxiety and depression. However, a typical intervention program can last several months, and participants may lose interest over time and fail to stay engaged until the end. Accurately predicting user dropout is crucial for delivering timely measures that address disengagement and reduce its adverse effects on treatment. We develop a temporal deep learning approach to accurately predict dropout, leveraging advanced data augmentation and feature engineering techniques. By integrating interaction metrics from user behavior logs with semantic features from user self-reflections over a nine-week intervention program, our approach effectively characterizes users’ behavior patterns within the intervention. The results validate the efficacy of temporal models for continuous dropout prediction.
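A minimal sketch of a temporal dropout predictor: a GRU consumes one feature vector per week (interaction metrics plus text-derived features) and emits a dropout probability after every observed week, which is what makes the prediction continuous. The nine-week horizon matches the abstract; the feature dimension and architecture are assumptions.

```python
import torch
import torch.nn as nn

WEEKS, FEAT_DIM = 9, 16  # nine-week program; assumed 16-dim weekly features

class DropoutPredictor(nn.Module):
    """GRU emitting a dropout probability after every observed week."""
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(FEAT_DIM, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):  # x: (batch, weeks_so_far, FEAT_DIM)
        out, _ = self.gru(x)
        return torch.sigmoid(self.head(out)).squeeze(-1)  # (batch, weeks_so_far)

model = DropoutPredictor()
x = torch.randn(4, WEEKS, FEAT_DIM)  # 4 users, full 9-week histories
print(model(x).shape)  # torch.Size([4, 9]): one risk score per week
```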
{"title":"Continuous prediction of user dropout in a mobile mental health intervention program: An exploratory machine learning approach","authors":"Pinxiang Wang , Hanqi Chen , Zhouyu Li , Wenyao Xu , Yu-Ping Chang , Huining Li","doi":"10.1016/j.smhl.2025.100565","DOIUrl":"10.1016/j.smhl.2025.100565","url":null,"abstract":"<div><div>Mental health intervention can help to release individuals’ mental symptoms like anxiety and depression. A typical mental health intervention program can last for several months, people may lose interests along with the time and cannot insist till the end. Accurately predicting user dropout is crucial for delivering timely measures to address user disengagement and reduce its adverse effects on treatment. We develop a temporal deep learning approach to accurately predict dropout, leveraging advanced data augmentation and feature engineering techniques. By integrating interaction metrics from user behavior logs and semantic features from user self-reflections over a nine-week intervention program, our approach effectively characterizes user’s mental health intervention behavior patterns. The results validate the efficacy of temporal models for continuous dropout prediction.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100565"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143704185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HealthQ: Unveiling questioning capabilities of LLM chains in healthcare conversations
Pub Date: 2025-03-25 | DOI: 10.1016/j.smhl.2025.100570
Ziyu Wang, Hao Li, Di Huang, Hye-Sung Kim, Chae-Won Shin, Amir M. Rahmani
Effective patient care in digital healthcare requires large language models (LLMs) that not only answer questions but also actively gather critical information through well-crafted inquiries. This paper introduces HealthQ, a novel framework for evaluating the questioning capabilities of LLM healthcare chains. By implementing advanced LLM chains, including Retrieval-Augmented Generation (RAG), Chain of Thought (CoT), and reflective chains, HealthQ assesses how effectively these chains elicit comprehensive and relevant patient information. To achieve this, we integrate an LLM judge to evaluate generated questions across metrics such as specificity, relevance, and usefulness, while aligning these evaluations with traditional Natural Language Processing (NLP) metrics like ROUGE and Named Entity Recognition (NER)-based set comparisons. We validate HealthQ using two custom datasets constructed from public medical datasets, ChatDoctor and MTS-Dialog, and demonstrate its robustness across multiple LLM judge models, including GPT-3.5, GPT-4, and Claude. Our contributions are threefold: we present the first systematic framework for assessing questioning capabilities in healthcare conversations, establish a model-agnostic evaluation methodology, and provide empirical evidence linking high-quality questions to improved patient information elicitation.
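The two evaluation strands the abstract pairs together can be sketched side by side: an LLM-judge rubric prompt scoring a generated question on specificity, relevance, and usefulness, and a traditional ROUGE overlap against a reference clinician question. `call_llm` is a hypothetical stand-in for the judge model (e.g., GPT-4), and the rubric wording is illustrative; only the metric names come from the abstract.

```python
from rouge_score import rouge_scorer  # pip install rouge-score

JUDGE_PROMPT = """You are evaluating a question asked by a healthcare chatbot.
Patient note: {note}
Generated question: {question}
Rate the question from 1-5 on each of: specificity, relevance, usefulness.
Reply with three integers separated by spaces."""

def judge_question(note: str, question: str, call_llm) -> dict[str, int]:
    # call_llm is a hypothetical callable wrapping the judge model.
    reply = call_llm(JUDGE_PROMPT.format(note=note, question=question))
    spec, rel, use = (int(tok) for tok in reply.split()[:3])
    return {"specificity": spec, "relevance": rel, "usefulness": use}

def rouge_l(generated: str, reference: str) -> float:
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    return scorer.score(reference, generated)["rougeL"].fmeasure

print(rouge_l("How long have you had the cough?",
              "When did your cough first start?"))
```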
{"title":"HealthQ: Unveiling questioning capabilities of LLM chains in healthcare conversations","authors":"Ziyu Wang , Hao Li , Di Huang , Hye-Sung Kim , Chae-Won Shin , Amir M. Rahmani","doi":"10.1016/j.smhl.2025.100570","DOIUrl":"10.1016/j.smhl.2025.100570","url":null,"abstract":"<div><div>Effective patient care in digital healthcare requires large language models (LLMs) that not only answer questions but also actively gather critical information through well-crafted inquiries. This paper introduces HealthQ, a novel framework for evaluating the questioning capabilities of LLM healthcare chains. By implementing advanced LLM chains, including Retrieval-Augmented Generation (RAG), Chain of Thought (CoT), and reflective chains, HealthQ assesses how effectively these chains elicit comprehensive and relevant patient information. To achieve this, we integrate an LLM judge to evaluate generated questions across metrics such as specificity, relevance, and usefulness, while aligning these evaluations with traditional Natural Language Processing (NLP) metrics like ROUGE and Named Entity Recognition (NER)-based set comparisons. We validate HealthQ using two custom datasets constructed from public medical datasets, ChatDoctor and MTS-Dialog, and demonstrate its robustness across multiple LLM judge models, including GPT-3.5, GPT-4, and Claude. Our contributions are threefold: we present the first systematic framework for assessing questioning capabilities in healthcare conversations, establish a model-agnostic evaluation methodology, and provide empirical evidence linking high-quality questions to improved patient information elicitation.</div></div>","PeriodicalId":37151,"journal":{"name":"Smart Health","volume":"36 ","pages":"Article 100570"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143704085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}