PhysKANNet: A KAN-based model for multiscale feature extraction and contextual fusion in remote physiological measurement
Tianqi Liu, Hanguang Xiao, Yisha Sun, Kun Zuo, Zhipeng Li, Zhiying Yang, Shihong Liu
Biomedical Signal Processing and Control, volume 100, Article 107111 (published 2024-11-01)
Impact factor: 4.9; JCR: Q1 (Engineering, Biomedical)
DOI: 10.1016/j.bspc.2024.107111
URL: https://www.sciencedirect.com/science/article/pii/S1746809424011698
Citation count: 0
Abstract
Physiological indicators reflect the health status of the human body, and remote photoplethysmography (rPPG) is a highly promising technology for contactless measurement of these indicators through facial video. However, current deep learning methods mainly rely on traditional neural networks with limited spatiotemporal receptive fields, overlooking the importance of multi-scale features and noise resistance in rPPG signal modeling. As a result, they struggle with subtle color changes and noise interference. To overcome these limitations, we leverage the advantages of the Kolmogorov-Arnold Network (KAN) in handling sparse data and propose PhysKANNet, a novel KAN-based encoder–decoder architecture that integrates multi-scale feature extraction and contextual information fusion to enhance rPPG signal extraction. We introduce three new plug-and-play modules for PhysKANNet: the rPPG-Aware Convolutional Attention Block, which extracts features at different scales through a multi-branch structure and enhances multi-scale representation using KAN’s nonlinear modeling capabilities; the Multi-Dimensional Feature Fusion Block, which combines high-dimensional features from the encoder with low-dimensional features from the decoder; and the rPPG Edge Sampling Block, which fuses edge and semantic information to further optimize signal extraction accuracy. We train PhysKANNet with unsupervised learning and conduct comprehensive experiments on multiple benchmark datasets. The results show that PhysKANNet significantly improves feature learning from unlabeled data, achieving excellent performance across various testing scenarios.
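To make the multi-branch, multi-scale idea in the abstract concrete, the following is a minimal sketch, not the authors' rPPG-Aware Convolutional Attention Block. It assumes a 1-D intensity trace (for example, a per-frame mean of a facial region) and replaces learned convolutions and KAN layers with fixed moving-average filters of different window sizes. The names `moving_average` and `multi_scale_features`, and the window sizes, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def moving_average(signal: Sequence[float], window: int) -> List[float]:
    """Same-length moving average; windows are truncated at the signal edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def multi_scale_features(
    signal: Sequence[float], windows: Sequence[int] = (1, 3, 5)
) -> List[List[float]]:
    """Run one branch per window size, then stack the branch outputs per time step.

    Assumption: fixed averaging filters stand in for the learned branches
    described in the abstract; this is not the published architecture.
    """
    branches = [moving_average(signal, w) for w in windows]
    # Each row holds one time step; each column holds one temporal scale.
    return [list(step) for step in zip(*branches)]


if __name__ == "__main__":
    trace = [0.0, 1.0, 0.0, 1.0, 0.0]
    for row in multi_scale_features(trace, windows=(1, 3)):
        print(row)
```

Each output row pairs the raw sample (window 1) with progressively smoother views of the same time step, which is the sense in which parallel branches expose several temporal scales to later layers. In a learned model the averaging filters would be trainable kernels and the stacking step would be followed by a fusion layer.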
About the journal:
Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management.
Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.