Dual-Path TokenLearner for Remote Photoplethysmography-Based Physiological Measurement With Facial Videos

IF 4.5 | Zone 2 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, CYBERNETICS) | IEEE Transactions on Computational Social Systems | Pub Date: 2024-02-27 | DOI: 10.1109/TCSS.2024.3356713
Wei Qian;Dan Guo;Kun Li;Xiaowei Zhang;Xilan Tian;Xun Yang;Meng Wang
IEEE Xplore: https://ieeexplore.ieee.org/document/10445699/
Citations: 0

Abstract

Remote photoplethysmography (rPPG)-based physiological measurement is an emerging yet crucial vision task. Its challenge lies in predicting rPPG signals accurately and without contact from facial videos corrupted by noise such as illumination variations, facial occlusions, and head movements. Existing mainstream convolutional neural network (CNN)-based models detect physiological signals by capturing subtle color changes in facial regions of interest (ROIs) caused by heartbeats. However, such models are constrained by the limited local spatial or temporal receptive fields of their neural units. In contrast, this article proposes a native transformer-based framework called dual-path TokenLearner (dual-TL), which uses learnable tokens to integrate both spatial and temporal informative contexts from the global perspective of the video. Specifically, dual-TL uses a spatial TokenLearner (S-TL) to explore associations among different facial ROIs, which keeps the rPPG prediction away from noisy ROI disturbances. Complementarily, a temporal TokenLearner (T-TL) is designed to infer the quasi-periodic pattern of heartbeats, which eliminates temporal disturbances such as head movements. The two TokenLearners, S-TL and T-TL, are executed in a dual-path mode, enabling the model to reduce noise disturbances in the final rPPG signal prediction. Extensive experiments are conducted on four physiological measurement benchmark datasets. Dual-TL achieves state-of-the-art performance in both intra-dataset and cross-dataset testing, demonstrating its immense potential as a basic backbone for rPPG measurement.
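To make the idea of learnable tokens in two paths more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify layer details, so the function names (`token_attend`, `dual_path`), the toy sizes, the random "learned" tokens, and the omission of query/key/value projections and of any fusion step are all assumptions made for illustration. It shows only the general pattern of a small set of tokens pooling information across all facial ROIs within a frame (spatial path) and across all frames within an ROI (temporal path).

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def token_attend(tokens, features):
    """Cross-attention: each token (query) pools all features.

    tokens:   K vectors of length D (hypothetical learnable parameters)
    features: N vectors of length D (used as keys and values, no projections)
    returns:  K vectors of length D, each a convex combination of features
    """
    dim = len(tokens[0])
    pooled = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, f)) / math.sqrt(dim)
                  for f in features]
        weights = softmax(scores)
        pooled.append([sum(w * f[d] for w, f in zip(weights, features))
                       for d in range(dim)])
    return pooled

def dual_path(video, spatial_tokens, temporal_tokens):
    """video[t][r] is a D-dim feature for frame t and facial ROI r."""
    n_frames, n_rois = len(video), len(video[0])
    # Spatial path: within each frame, tokens attend across all ROIs.
    spatial = [token_attend(spatial_tokens, video[t]) for t in range(n_frames)]
    # Temporal path: within each ROI, tokens attend across all frames.
    temporal = [
        token_attend(temporal_tokens, [video[t][r] for t in range(n_frames)])
        for r in range(n_rois)
    ]
    return spatial, temporal

random.seed(0)
T, R, D = 8, 4, 6  # frames, ROIs, feature size (toy values, assumed)

def rand_vec():
    return [random.gauss(0.0, 1.0) for _ in range(D)]

video = [[rand_vec() for _ in range(R)] for _ in range(T)]
s_tokens = [rand_vec() for _ in range(2)]  # 2 spatial tokens (assumed count)
t_tokens = [rand_vec() for _ in range(3)]  # 3 temporal tokens (assumed count)

spatial_out, temporal_out = dual_path(video, s_tokens, t_tokens)
# Shapes: (frames, spatial tokens, dim) and (ROIs, temporal tokens, dim)
print(len(spatial_out), len(spatial_out[0]), len(spatial_out[0][0]))
print(len(temporal_out), len(temporal_out[0]), len(temporal_out[0][0]))
```

Because every token attends over the whole set of ROIs or frames, its receptive field is global rather than local, which is the contrast with CNN-based models that the abstract draws. In a trained model the tokens would be optimized so that noisy ROIs or motion-corrupted frames receive low attention weight.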
Source journal
IEEE Transactions on Computational Social Systems (Social Sciences: Social Sciences (miscellaneous))
CiteScore: 10.00
Self-citation rate: 20.00%
Articles published: 316
About the journal: IEEE Transactions on Computational Social Systems focuses on topics such as the modeling, simulation, analysis, and understanding of social systems from a quantitative and/or computational perspective. "Systems" include man-man, man-machine, and machine-machine organizations and adversarial situations, as well as social media structures and their dynamics. More specifically, the journal publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, computational behavior modeling, and their applications.