TranSenseFusers: A temporal CNN-Transformer neural network family for explainable PPG-based stress detection

IF 4.9 | Zone 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Biomedical Signal Processing and Control | Pub Date: 2024-12-03 | DOI: 10.1016/j.bspc.2024.107248

Panagiotis Kasnesis, Christos Chatzigeorgiou, Michalis Feidakis, Álvaro Gutiérrez, Charalampos Z. Patrikakis

Biomedical Signal Processing and Control, vol. 102, Article 107248. https://www.sciencedirect.com/science/article/pii/S1746809424013065

Citations: 0

Abstract

Stress is a common everyday emotional state in modern society that contributes to both physical and mental illnesses. Thus, detecting and managing the degree of stress is crucial to improving well-being. Wearable devices equipped with biosensors, such as PhotoPlethysmoGraphy (PPG), can reliably measure a person’s affective state. However, PPG-based approaches suffer from the presence of Motion Artifacts (MA), which degrade their overall performance. Classical machine learning and deep learning approaches have been proposed over the years for PPG-based stress detection, exploiting signal processing techniques to remove the recorded noise, but they lack explainability or their performance fails to generalize across subjects. In the current work, we present a novel architecture, TranSenseFuser, composed of temporal convolutions followed by feature-level or sequence-level multi-head attention, to improve the effectiveness of sensor fusion and to exploit the resulting attention maps as a form of explainability. The developed models are evaluated on a highly benchmarked public dataset, namely WESAD, achieving state-of-the-art results (up to 98.46% accuracy and 97.03% F1-score) using different window sizes and cross-validation set-ups. Moreover, we demonstrate the explainability of the model with respect to filtering out motion artifacts by visualizing the obtained attention maps, and we quantify the performance of this artifact segmentation feature in a zero-shot manner.
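The general idea of temporal convolutions followed by sequence-level attention can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head with no learned projections, two hand-picked kernels, and a synthetic signal, and every name in it (`temporal_conv`, `self_attention`, `attention_received`) is hypothetical. It only shows how an attention map over time steps arises and how a per-time-step score could be read off it.

```python
import math

def temporal_conv(signal, kernels):
    """Valid 1D convolution with ReLU: one feature vector per time step."""
    width = len(kernels[0])
    features = []
    for t in range(len(signal) - width + 1):
        window = signal[t:t + width]
        features.append([max(0.0, sum(w * x for w, x in zip(kernel, window)))
                         for kernel in kernels])
    return features

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention over time steps.

    Assumption: queries, keys and values are the features themselves
    (no learned projections), to keep the sketch short.
    """
    scale = math.sqrt(len(features[0]))
    attn = [softmax([sum(q * k for q, k in zip(query, key)) / scale
                     for key in features])
            for query in features]
    dim = len(features[0])
    fused = [[sum(a * value[j] for a, value in zip(row, features))
              for j in range(dim)]
             for row in attn]
    return fused, attn

def attention_received(attn):
    """Average attention each time step receives from all queries."""
    n = len(attn)
    return [sum(row[j] for row in attn) / n for j in range(n)]

# Synthetic stand-in for a PPG window, with one artificial spike.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
signal[16] += 3.0

# Hypothetical kernels: a smoothing filter and a difference filter.
kernels = [[0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]]

features = temporal_conv(signal, kernels)
fused, attn = self_attention(features)
received = attention_received(attn)
```

Each row of `attn` is a probability distribution over time steps, so `received` gives one score per time step that can be plotted against the signal. Whether and how such scores separate artifact from clean segments depends on the trained model and is not something this toy example establishes.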

Source journal

Biomedical Signal Processing and Control (Engineering and Technology; Engineering: Biomedical)

CiteScore: 9.80
Self-citation rate: 13.70%
Articles published: 822
Review time: 4 months

Journal description: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.