ACM Transactions on Computing for Healthcare: latest articles

Domain-Invariant Representation Learning and Sleep Dynamics Modeling for Automatic Sleep Staging.

IF 8

ACM transactions on computing for healthcare

Pub Date : 2025-10-01 Epub Date: 2025-10-13 DOI: 10.1145/3757066

Seungyeon Lee, Thai-Hoang Pham, Zhao Cheng, Ping Zhang

Sleep staging has become a critical task in diagnosing and treating sleep disorders to prevent sleep-related diseases. With growing large-scale sleep databases, significant progress has been made toward automatic sleep staging. However, previous studies face four critical problems: the heterogeneity of subjects' physiological signals, the inability to extract meaningful information from unlabeled data to improve predictive performance, the difficulty of modeling correlations between sleep stages, and the lack of an effective mechanism to quantify predictive uncertainty. In this study, we propose a neural network-based sleep staging model, DREAM, to learn domain-generalized representations from physiological signals and model sleep dynamics. DREAM learns sleep-related and subject-invariant representations from diverse subjects' sleep signals and models sleep dynamics by capturing interactions between sequential signal segments and between sleep stages. We conducted a comprehensive empirical study to demonstrate the superiority of DREAM, including sleep stage prediction experiments, a case study, the use of unlabeled data, and uncertainty quantification. Notably, the case study validates DREAM's ability to learn a generalized decision function for new subjects, especially when testing and training subjects differ. Uncertainty quantification shows that DREAM provides prediction uncertainty, making the model more reliable and helping sleep experts in real-world applications.
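
The abstract does not say how DREAM quantifies predictive uncertainty, so the following is only a generic illustration of the idea, not the authors' method. It assumes a classifier that outputs one probability per sleep stage and uses normalized entropy as the uncertainty score; the function names and the example probabilities are hypothetical.

```python
import math

# The five standard sleep stages; the probability vectors below are made up.
STAGES = ["W", "N1", "N2", "N3", "REM"]

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalized_entropy(probs):
    """Entropy scaled to [0, 1]; 1 means the model is maximally unsure."""
    return predictive_entropy(probs) / math.log(len(probs))

confident_epoch = [0.96, 0.01, 0.01, 0.01, 0.01]  # clearly stage W
ambiguous_epoch = [0.20, 0.20, 0.20, 0.20, 0.20]  # no preferred stage

for name, probs in [("confident", confident_epoch), ("ambiguous", ambiguous_epoch)]:
    stage = STAGES[probs.index(max(probs))]
    print(f"{name}: predicted {stage}, uncertainty {normalized_entropy(probs):.2f}")
```

A score like this could be used to flag high-uncertainty epochs for manual review by a sleep expert, which is the kind of real-world use the abstract describes.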

Citations: 0

Video-based Intake Gesture Recognition Using Meal-length Context.

IF 8

ACM transactions on computing for healthcare

Pub Date : 2025-04-01 Epub Date: 2025-02-17 DOI: 10.1145/3709151

Zeyu Tang, Adam Hoover

This article explores video analysis methods for monitoring eating behaviors, a critical factor in approximately 70% of global deaths due to illnesses like cancer, diabetes, and heart disease. Automated monitoring quantifies aspects such as meal duration, food types, and intake gestures (bite and drink gestures). Previous deep-learning methods segment videos into short clips (e.g., 16 frames at 8 Hz) for analysis, but this approach overlooks meal-length patterns in gesture distribution that are common across individuals and sessions and that can enhance detection accuracy. Our study introduces a novel pipeline that analyzes the entire meal context (5-40 minutes). We propose a framework allowing a global detector to learn meal-length patterns with manageable computational demands. Additionally, we introduce a new augmentation technique to generate hundreds of meal-length feature samples per video, facilitating effective training of a global detector with limited video availability. Experimental results on two datasets (Clemson Cafeteria and EatSense) demonstrate that our pipeline significantly enhances the performance of state-of-the-art window-based networks, particularly in reducing false positives in gesture detection. On the Clemson Cafeteria dataset of 486 meal videos (the largest dataset to date), our method achieves F1 scores of 0.93 for bite gestures and 0.88 for drink gestures, substantially outperforming existing methodologies.
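
To make the gap between clip-level and meal-level context concrete, the short sketch below works through the figures quoted in the abstract (16 frames at 8 Hz, meals of 5 to 40 minutes). It assumes back-to-back, non-overlapping clips, which the abstract does not specify; the function names are illustrative only.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Duration covered by one clip, in seconds."""
    return frames / fps

def clips_per_meal(meal_minutes: float, frames: int, fps: float) -> int:
    """Number of back-to-back (non-overlapping) clips that fit in one meal."""
    return int(meal_minutes * 60 // clip_seconds(frames, fps))

window = clip_seconds(16, 8)          # 16 frames at 8 Hz
shortest = clips_per_meal(5, 16, 8)   # 5-minute meal
longest = clips_per_meal(40, 16, 8)   # 40-minute meal

print(f"one clip spans {window:.0f} s")
print(f"5-minute meal: {shortest} clips; 40-minute meal: {longest} clips")
```

Under these assumptions a window-based network sees about 2 seconds at a time, while a full meal corresponds to between 150 and 1200 such clips, which is the context a meal-length detector can draw on.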

Citations: 0

CogProg: Utilizing Large Language Models to Forecast In-the-moment Health Assessment.

IF 8

ACM transactions on computing for healthcare

Pub Date : 2025-04-01 Epub Date: 2025-04-24 DOI: 10.1145/3709153

Gina Sprint, Maureen Schmitter-Edgecombe, Raven Weaver, Lisa Wiese, Diane J Cook

Forecasting future health status is beneficial for understanding health patterns and providing anticipatory support for cognitive and physical health difficulties. In recent years, generative large language models (LLMs) have shown promise as forecasters. Though not traditionally considered strong candidates for numeric tasks, LLMs demonstrate emerging abilities to address various forecasting problems. They also provide the ability to incorporate unstructured information and explain their reasoning process. In this paper, we explore whether LLMs can effectively forecast future self-reported health state. To do this, we used in-the-moment assessments of mental sharpness, fatigue, and stress from multiple studies, drawing on daily responses (N=106 participants) and responses that are accompanied by text descriptions of activities (N=32 participants). With these data, we constructed prompt/response pairs to predict a participant's next answer. We fine-tuned several LLMs and applied chain-of-thought prompting, evaluating forecasting accuracy and prediction explainability. Notably, we found that LLMs achieved the lowest mean absolute error (MAE) overall (0.851), while gradient boosting achieved the lowest overall root mean squared error (RMSE) (1.356). When additional text context was provided, LLM forecasts achieved the lowest MAE for predicting mental sharpness (0.862), fatigue (1.000), and stress (0.414). These multimodal LLMs further outperformed the numeric baselines in terms of RMSE when predicting stress (0.947), although numeric algorithms achieved the best RMSE results for mental sharpness (1.246) and fatigue (1.587). This study offers valuable insights for future applications of LLMs in health-based forecasting. The findings suggest that LLMs, when supplemented with additional text information, can be effective tools for improving health forecasting accuracy.
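
The abstract reports that one family of models wins on MAE while another wins on RMSE. The sketch below shows how that can happen: RMSE squares each error, so it penalizes occasional large misses more heavily than MAE does. The ratings and forecasts are invented for illustration and are not data from the study.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical self-reported ratings and two hypothetical forecasters.
actual = [3, 3, 3, 3]
forecaster_a = [3, 3, 3, 6]  # usually exact, with one large miss
forecaster_b = [4, 4, 4, 4]  # always off by one

for name, forecast in [("A", forecaster_a), ("B", forecaster_b)]:
    print(f"forecaster {name}: MAE={mae(actual, forecast):.2f}, RMSE={rmse(actual, forecast):.2f}")
```

Forecaster A has the lower MAE (0.75 versus 1.00) but the higher RMSE (1.50 versus 1.00), so the two metrics can rank the same pair of models differently.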

Citations: 0

A method for comparing time series by untangling time-dependent and independent variations in biological processes

ACM transactions on computing for healthcare

Pub Date : 2024-07-26 DOI: 10.1145/3681795

A. J. Thottupattu, J. Sivaswamy

Biological processes like growth, aging, and disease progression are generally studied with follow-up scans taken at different time points, i.e., image time series (TS) based analysis. An image time series represents the evolution of anatomy over time, but different anatomies may have different structural characteristics and temporal paths. Therefore, when comparing two image time series, it is important to separate the time-dependent path difference from time-independent basic anatomy/shape changes in order to better understand the causes of the observed differences. This paper presents a method to untangle and quantify the path and shape differences between two TS. The proposed method is evaluated with simulated as well as adult and fetal neuro templates. Results show that the metric can separate and quantify the path and shape differences between TS.

Citations: 0

AI-assisted Diagnosing, Monitoring, and Treatment of Mental Disorders: A Survey

ACM transactions on computing for healthcare

Pub Date : 2024-07-25 DOI: 10.1145/3681794

Faustino Muetunda, Soumaya Sabry, M. Jamil, Sebastião Pais, Gael Dias, João Cordeiro

Globally, 1 in 7 people has some kind of mental or substance use disorder that affects their thinking, feelings, and behaviour in everyday life. People with mental health disorders can continue their normal lives with proper treatment and support. Mental well-being is vital for physical health. The use of AI in mental health areas has grown exponentially in the last decade. However, mental disorders remain complex to diagnose because many mental illnesses share similar, common symptoms and differ only minutely. Intelligent systems can help us identify mental diseases precisely, which is a critical step in diagnosis. Using these systems efficiently can improve treatment and speed patients' recovery. We survey different artificial intelligence systems used in mental healthcare, such as mobile applications, machine learning and deep learning methods, and multimodal systems, and draw comparisons from recent developments and related challenges. We also discuss types of mental disorders and how these different techniques can support the therapist in diagnosing, monitoring, and treating patients with mental disorders.

Citations: 0

HEalthRecordBERT (HERBERT): leveraging transformers on electronic health records for chronic kidney disease risk stratification

ACM transactions on computing for healthcare

Pub Date : 2024-07-19 DOI: 10.1145/3665899

Alex Moore, B. Orset, A. Yassaee, Benjamin Irving, Davide Morelli

Risk stratification is an essential tool in the fight against many diseases, including chronic kidney disease. Recent work has focused on applying techniques from machine learning and leveraging the information contained in a patient’s electronic health record (EHR). Irregular intervals between data entries and the large number of variables tracked in EHR datasets can make them challenging to work with. Many of the difficulties associated with these datasets can be overcome by using large language models, such as bidirectional encoder representations from transformers (BERT). Previous attempts to apply BERT to EHR for risk stratification have shown promise. In this work we propose HERBERT, a novel application of BERT to EHR data. We identify two key areas where BERT models must be modified to adapt them to EHR data, namely: the embedding layer and the pretraining task. We show how changes to these can lead to improved performance, relative to the previous state of the art. We evaluate our model by predicting the transition of chronic kidney disease patients to end stage renal disease. The strong performance of our model justifies our architectural changes and suggests that large language models could play an important role in future renal risk stratification.

Citations: 0

A Computation Model to Estimate Interaction Intensity through Non-verbal Behavioral Cues: A Case Study of Intimate Couples under the Impact of Acute Alcohol Consumption

ACM transactions on computing for healthcare

Pub Date : 2024-07-19 DOI: 10.1145/3664826

Zhiwei Yu, Cory Crane, Linlin Chen, Maria Testa, Zhi Zheng

This work introduced a novel analysis method to estimate interaction intensity, i.e., the level of positivity/negativity of an interaction, for intimate couples (married and heterosexual) under the impact of alcohol, which strongly influences behavioral health. Non-verbal behaviors are critical in interpersonal interactions. However, whether computer vision-detected non-verbal behaviors can effectively estimate the interaction intensity of intimate couples is still unexplored. In this work, we proposed novel measurements and investigated their feasibility for estimating interaction intensities through machine learning regression models. Analyses were conducted on a conflict-resolution conversation video dataset of intimate couples before and after acute alcohol consumption. Results showed that the estimation error was lowest in the no-alcohol state but increased significantly when a model trained on no-alcohol data was applied to after-alcohol data, indicating that alcohol altered the interaction data in the feature space. While training a model on rich after-alcohol data would be the ideal way to address the performance decrease, data collection in such a risky state is challenging in real life. Thus, we proposed a new State-Induced Domain Adaptation (SIDA) framework, which improves estimation performance using only a small after-alcohol training dataset, pointing to a future direction for addressing data scarcity issues.

Citations: 0

Mapping Distributed Ledger Technology Characteristics to Use Cases in Healthcare: A Structured Literature Review

ACM transactions on computing for healthcare

Pub Date : 2024-07-19 DOI: 10.1145/3653076

Shanshan Hu, Manuel Schmidt-Kraepelin, Scott Thiebes, A. Sunyaev

Following the success of the Bitcoin blockchain, distributed ledger technology (DLT) has received extensive attention in health informatics research. Yet, the healthcare industry is highly complex with many different stakeholders, information systems, regulations, and challenges. Thus, DLT may be used in various settings and for different purposes. Initial surveys have started to synthesize our knowledge of the different use cases in which healthcare may benefit from DLT implementations. However, an in-depth understanding of whether and how these use cases differ in their requirements for DLT characteristics (i.e., technical or administrative design features) is still lacking. In this work, we conducted a structured review of 185 studies on DLT-based applications in healthcare. The results reveal six pertinent use cases, each with its own combination of purposes that DLT is used for. Furthermore, our study shows that each of these use cases has a unique set of requirements with regard to the most important DLT characteristics. In doing so, we seek to guide practitioners in the development of highly effective DLT-based applications in various healthcare settings and pave the way for future research to investigate the understudied areas of DLT-based applications in healthcare.

Citations: 0

iScan: Detection of Colorectal Cancer From CT Scan Images Using Deep Learning

ACM transactions on computing for healthcare

Pub Date : 2024-07-19 DOI: 10.1145/3676282

Sagnik Ghosal, Debanjan Das, Jay Kumar Rai, Akanksha Singh Pandaw, Sakshi Verma

Colorectal cancer, a highly lethal form of cancer, can be treated effectively if detected early. However, the current diagnosis process involves a time-consuming manual review of CT scans to identify cancerous regions and behavior, leading to resource consumption, subjectivity, and dependency on manual assessment. To address these challenges, we propose a three-phase deep neural system for automated colorectal cancer detection using CT scan images. It includes a SegNet network to identify tumor locations, an InceptionResNet V2 network to classify tumors as benign or malignant, and an analysis of tumor area and perimeter to predict the cancer stage. The proposed model offers a fully automated solution by combining these functionalities under a single umbrella. In real-life CT scans from 37 patients, the proposed model achieved 95.8% ROI segmentation accuracy, a Dice coefficient of 0.6214, a 69.75% IoU score, and 95.83% tumor classification accuracy. The unique approach using Radial Length (RL) and Circularity (C) parameters predicted the T-stage with close to 85% accuracy. Based on these outcomes, the proposed system establishes itself as a reliable and suitable alternative to traditional cancer diagnosis techniques by leveraging the power of automation, deep learning, and innovative parameter analysis.
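For readers unfamiliar with the segmentation and shape metrics named in the abstract, the following is a minimal sketch of how a Dice coefficient, an IoU score, and a circularity value are commonly computed. It is illustrative only: the function names `dice_and_iou` and `circularity` are hypothetical, the masks are toy data, and the circularity formula 4πA/P² is the common textbook definition, assumed here because the abstract does not state which definition the authors use.

```python
import math


def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equal-shaped binary masks (nested lists of 0/1)."""
    inter = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t      # pixel is foreground in both masks
            pred_sum += p       # foreground pixels in the prediction
            truth_sum += t      # foreground pixels in the ground truth
    union = pred_sum + truth_sum - inter
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + truth_sum)
    iou = inter / union
    return dice, iou


def circularity(area, perimeter):
    """Assumed definition 4*pi*A / P^2: 1.0 for a circle, lower for irregular outlines."""
    return 4 * math.pi * area / perimeter ** 2


# Toy 3x4 masks (hypothetical data, not taken from the paper).
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]

dice, iou = dice_and_iou(pred, truth)
print(f"Dice={dice:.2f}, IoU={iou:.2f}")  # Dice=0.75, IoU=0.60
```

On the toy masks, three pixels overlap out of four predicted and four true foreground pixels, giving a Dice of 0.75 and an IoU of 0.60. An irregular tumor outline would be expected to score well below 1.0 on the circularity measure, which is presumably why a shape parameter of this kind can carry staging information.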

Citations: 0

Loss Relaxation Strategy for Noisy Facial Video-based Automatic Depression Recognition

ACM transactions on computing for healthcare

Pub Date : 2024-03-04 DOI: 10.1145/3648696

Siyang Song, Yi-Xiang Luo, Tugba Tumer, Michel Valstar, Hatice Gunes

Automatic depression analysis has been widely investigated on face videos that were carefully collected and annotated in lab conditions. However, videos collected under real-world conditions may suffer from various types of noise due to challenging data acquisition conditions and a lack of annotators. Although deep learning (DL) models frequently show excellent depression analysis performance on datasets collected in controlled lab conditions, such noise may degrade their generalization to real-world depression analysis tasks. In this paper, we show that noisy facial data and annotations consistently change the distribution of training losses for facial depression DL models, i.e., noisy data-label pairs cause larger loss values than clean data-label pairs. Since different loss functions may be applied depending on the model and task, we propose a generic loss function relaxation strategy for face video-based depression analysis that jointly reduces the negative impact of various noisy data and annotation problems on both classification and regression loss functions. The parameters of the proposed strategy are adapted automatically during depression model training. Experimental results on 25 artificially created noisy depression conditions (i.e., five noise types with five different noise levels) show that our loss relaxation strategy clearly enhances both classification and regression loss functions, yielding superior face video-based depression analysis models under almost all noisy conditions. Our approach is robust to its main variable settings and obtains its parameters adaptively and automatically during training.
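The observation that noisy data-label pairs produce larger losses can be illustrated with a small sketch. The code below is not the authors' formulation, which the abstract does not specify. It is a hypothetical stand-in that caps each per-sample loss at the batch mean plus `k` standard deviations, so the cap adapts to each batch. The function name `relaxed_mean_loss`, the parameter `k`, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def relaxed_mean_loss(losses, k=1.0):
    """Average per-sample losses after capping outliers at mean + k * std.

    Hypothetical illustration of loss relaxation; not the method from the paper.
    Returns (relaxed batch loss, cap used for this batch).
    """
    mu = mean(losses)
    sigma = pstdev(losses)          # population standard deviation of the batch
    cap = mu + k * sigma            # threshold recomputed for every batch
    capped = [min(loss, cap) for loss in losses]
    return mean(capped), cap


# Four plausible "clean" samples and one suspiciously large loss (made-up numbers).
batch_losses = [0.2, 0.3, 0.25, 0.25, 4.0]

plain = mean(batch_losses)
relaxed, cap = relaxed_mean_loss(batch_losses)
print(f"plain={plain:.3f}, relaxed={relaxed:.3f}, cap={cap:.3f}")
```

Because the cap is derived from the batch statistics rather than fixed in advance, the same function applies unchanged to classification losses (for example cross-entropy) and regression losses (for example squared error), which is the kind of loss-agnostic behavior the abstract describes. In this sketch the suspected noisy sample still contributes to the batch loss, only with reduced influence.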

Citations: 0