Wearable Data From Subjects Playing Super Mario, Taking University Exams, or Performing Physical Exercise Help Detect Acute Mood Disorder Episodes via Self-Supervised Learning: Prospective, Exploratory, Observational Study.

IF 5.4 | Region 2 (Medicine) | JCR Q1, Health Care Sciences & Services | JMIR mHealth and uHealth | Pub Date: 2024-07-17 | DOI: 10.2196/55094

Filippo Corponi, Bryan M Li, Gerard Anmella, Clàudia Valenzuela-Pascual, Ariadna Mas, Isabella Pacchiarotti, Marc Valentí, Iria Grande, Antoni Benabarre, Marina Garriga, Eduard Vieta, Allan H Young, Stephen M Lawrie, Heather C Whalley, Diego Hidalgo-Mazzei, Antonio Vergari

Citation: JMIR mHealth and uHealth, volume 12, e55094 (Journal Article). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11292167/pdf/

Citations: 0

Abstract

Background: Personal sensing, leveraging data passively and near-continuously collected with wearables from patients in their ecological environment, is a promising paradigm to monitor mood disorders (MDs), a major determinant of the worldwide disease burden. However, collecting and annotating wearable data is resource intensive. Studies of this kind can thus typically afford to recruit only a few dozen patients. This constitutes one of the major obstacles to applying modern supervised machine learning techniques to MD detection.

Objective: In this paper, we overcame this data bottleneck and advanced the detection of acute MD episodes from wearable data by building on recent advances in self-supervised learning (SSL). This approach leverages unlabeled data to learn representations during pretraining, which are subsequently exploited for a supervised task.
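As a minimal sketch of how unlabeled data can supply its own training targets, the Python below hides a random subset of samples in a signal and scores a prediction only at the hidden positions. The masking task, the function names `mask_segment` and `masked_mse`, the 15% mask ratio, and the toy signal are all illustrative assumptions; the abstract does not state which surrogate tasks were used.

```python
import random


def mask_segment(segment, mask_ratio=0.15, seed=0):
    """Hide a random subset of samples; return (masked_copy, masked_indices)."""
    rng = random.Random(seed)
    n_masked = max(1, int(len(segment) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(segment)), n_masked))
    masked = list(segment)
    for i in masked_idx:
        masked[i] = 0.0  # placeholder value standing in for a mask token (assumption)
    return masked, masked_idx


def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed only at the hidden positions."""
    return sum((prediction[i] - target[i]) ** 2 for i in masked_idx) / len(masked_idx)


# Toy unlabeled signal (hypothetical values, not study data).
signal = [0.1 * i for i in range(20)]
masked, idx = mask_segment(signal)

# A trivial "model" that just echoes the masked input; a real encoder
# would be trained to drive this loss down before supervised fine-tuning.
loss = masked_mse(masked, signal, idx)
```

No human annotation enters this loop: the original signal is the target, which is what makes pretraining on unlabeled recordings possible.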

Methods: We collected open access data sets recorded with the Empatica E4 wristband and spanning different personal sensing tasks unrelated to MD monitoring, from emotion recognition in Super Mario players to stress detection in undergraduates, and devised a preprocessing pipeline performing on-/off-body detection, sleep/wake detection, segmentation, and (optionally) feature extraction. With 161 E4-recorded subjects, we introduced E4SelfLearning, the largest open access collection to date, and its preprocessing pipeline. We developed a novel E4-tailored transformer (E4mer) architecture, serving as the blueprint for both SSL and fully supervised learning; we assessed whether and under which conditions self-supervised pretraining led to an improvement over fully supervised baselines (ie, the fully supervised E4mer and pre-deep learning algorithms) in detecting acute MD episodes from recording segments taken in 64 patients (n=32, 50%, acute; n=32, 50%, stable).
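The segmentation step of such a pipeline can be sketched as a sliding window that keeps only windows recorded entirely on-body. This is a simplified illustration, not the E4SelfLearning pipeline itself: the function name `segment_recording`, the precomputed `on_body` flags, and the window and stride values are assumptions made for the example.

```python
def segment_recording(samples, on_body, window, stride):
    """Split a 1-D recording into fixed-length windows.

    Windows containing any off-body sample are discarded, and an
    incomplete window at the end of the recording is dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(samples) != len(on_body):
        raise ValueError("samples and on_body must have the same length")

    segments = []
    for start in range(0, len(samples) - window + 1, stride):
        end = start + window
        if all(on_body[start:end]):
            segments.append(samples[start:end])
    return segments


# Toy recording (hypothetical): ten samples, with sample 6 flagged off-body.
samples = list(range(10))
on_body = [True] * 6 + [False] + [True] * 3
segments = segment_recording(samples, on_body, window=4, stride=2)
# segments == [[0, 1, 2, 3], [2, 3, 4, 5]]
```

The two windows that overlap the off-body sample are dropped, so downstream models only see segments recorded while the device was worn.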

Results: SSL significantly outperformed fully supervised pipelines using either our novel E4mer or extreme gradient boosting (XGBoost): n=3353 (81.23%) against n=3110 (75.35%; E4mer) and n=2973 (72.02%; XGBoost) correctly classified recording segments from a total of 4128 segments. SSL performance was strongly associated with the specific surrogate task used for pretraining, as well as with unlabeled data availability.
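The reported percentages follow from the segment counts, and a short calculation also expresses the SSL gain in percentage points. The sketch below uses only the counts stated above; the variable and function names are illustrative. Recomputing from the counts gives about 75.34% for the supervised E4mer, which differs from the reported 75.35% only in the last digit and presumably reflects rounding.

```python
TOTAL_SEGMENTS = 4128

# Correctly classified segments per pipeline, as reported in the Results.
correct = {
    "SSL": 3353,
    "E4mer (supervised)": 3110,
    "XGBoost": 2973,
}


def accuracy_pct(n_correct, n_total):
    """Accuracy as a percentage of all segments."""
    return 100.0 * n_correct / n_total


accuracies = {name: accuracy_pct(n, TOTAL_SEGMENTS) for name, n in correct.items()}

# Absolute improvement of SSL over each baseline, in percentage points.
gain_over_e4mer = accuracies["SSL"] - accuracies["E4mer (supervised)"]
gain_over_xgboost = accuracies["SSL"] - accuracies["XGBoost"]

for name, acc in accuracies.items():
    print(f"{name}: {acc:.2f}%")
print(f"SSL vs E4mer: +{gain_over_e4mer:.2f} points")
print(f"SSL vs XGBoost: +{gain_over_xgboost:.2f} points")
```

In absolute terms, SSL classified 243 more segments correctly than the supervised E4mer and 380 more than XGBoost.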

Conclusions: We showed that SSL, a paradigm where a model is pretrained on unlabeled data with no need for human annotations before deployment on the supervised target task of interest, helps overcome the annotation bottleneck; the choice of the pretraining surrogate task and the size of unlabeled data for pretraining are key determinants of SSL success. We introduced E4mer, which can be used for SSL, and shared the E4SelfLearning collection, along with its preprocessing pipeline, which can foster and expedite future research into SSL for personal sensing.




Source journal: JMIR mHealth and uHealth (Medicine-Health Informatics)

CiteScore: 12.60

Self-citation rate: 4.00%

Articles published: 159

Review time: 10 weeks

About the journal: JMIR mHealth and uHealth (JMU, ISSN 2291-5222) is a spin-off journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175). JMIR mHealth and uHealth is indexed in PubMed, PubMed Central, and Science Citation Index Expanded (SCIE), and in June 2017 received an inaugural Impact Factor of 4.636. The journal focuses on health and biomedical applications in mobile and tablet computing, pervasive and ubiquitous computing, wearable computing, and domotics. JMIR mHealth and uHealth has been published since 2013 and was the first mHealth journal in PubMed. It publishes faster than the Journal of Medical Internet Research and has a broader scope, including papers that are more technical or more formative/developmental than those that journal would publish.