Authors: Minhao Ding; Guangxin Dongye; Ping Lv; Yipeng Ding
Journal: IEEE Sensors Journal, vol. 24, no. 22, pp. 38518-38526 (Journal Article)
Published: 2024-10-09
DOI: 10.1109/JSEN.2024.3473890
URL: https://ieeexplore.ieee.org/document/10713094/
Impact factor: 4.3; JCR: Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Citations: 0
FML-Vit: A Lightweight Vision Transformer Algorithm for Human Activity Recognition Using FMCW Radar
In recent years, human activity recognition (HAR) using frequency-modulated continuous-wave (FMCW) radar has been widely used in healthcare, smart driving, and smart living because it is convenient, inexpensive, and accurate. Past studies have mainly focused on improving the accuracy of HAR models while neglecting their deployment. Therefore, we propose a model named FMCW lightweight vision transformer (FML-Vit) for HAR, consisting primarily of the FML-Vit block and FML-Vit subsampling modules. By using a cascaded linear self-attention mechanism in place of the traditional multi-head attention mechanism, the FML-Vit block reduces the time complexity from $O(k^{2})$ to $O(k)$. The FML-Vit subsampling modules perform dimension reduction and feature reallocation, while the context broadcasting (CB) module reduces the density of the original attention maps, thereby increasing both the capacity and generalizability of the ViT. The proposed algorithm is compared with nine state-of-the-art methods on self-collected and open-source datasets. The results demonstrate that FML-Vit outperforms other current lightweight networks while achieving the fastest inference.
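The abstract does not define $k$ or spell out the cascaded linear self-attention formulation, so the following is only an illustrative sketch of where a quadratic-to-linear reduction can come from. It assumes $k$ is the number of tokens and uses a generic kernelized linear attention with an `elu(x) + 1` feature map; this is a stand-in technique, not the authors' FML-Vit block. The function names and the feature map are assumptions introduced for illustration.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive feature map (assumed choice, not from the paper)
    return np.where(x > 0, x + 1.0, np.exp(x))

def softmax_attention(Q, K, V):
    # Standard attention: materializes an (n_tokens x n_tokens) matrix,
    # so cost grows quadratically with the number of tokens.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V):
    # Kernelized attention: compute phi(K)^T V first, a (d x d_v) matrix
    # whose size does not depend on the number of tokens, so the total
    # cost grows linearly with the number of tokens.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V               # (d, d_v)
    z = phi_k.sum(axis=0)          # (d,) normalizer
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def linear_attention_quadratic(Q, K, V):
    # Same kernelized attention evaluated the quadratic way, kept only to
    # check that reordering the matrix products leaves the result unchanged.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    weights = phi_q @ phi_k.T      # (n_tokens, n_tokens)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n_tokens, d, d_v = 16, 8, 4        # hypothetical sizes
Q = rng.normal(size=(n_tokens, d))
K = rng.normal(size=(n_tokens, d))
V = rng.normal(size=(n_tokens, d_v))
out_linear = linear_attention(Q, K, V)
out_quadratic = linear_attention_quadratic(Q, K, V)
```

The saving comes purely from associativity: `(phi_q @ phi_k.T) @ V` and `phi_q @ (phi_k.T @ V)` are the same product, but only the first ordering builds a token-by-token matrix. Note that kernelized attention is not numerically equal to softmax attention; it is a different attention function with the same interface.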
About the journal:
The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing and applications of devices for sensing and transducing physical, chemical and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:
-Sensor Phenomenology, Modelling, and Evaluation
-Sensor Materials, Processing, and Fabrication
-Chemical and Gas Sensors
-Microfluidics and Biosensors
-Optical Sensors
-Physical Sensors: Temperature, Mechanical, Magnetic, and others
-Acoustic and Ultrasonic Sensors
-Sensor Packaging
-Sensor Networks
-Sensor Applications
-Sensor Systems: Signals, Processing, and Interfaces
-Actuators and Sensor Power Systems
-Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
-Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave, e.g., electromagnetic and acoustic, and non-wave, e.g., chemical, gravity, particle, thermal, radiative and non-radiative, sensor data; detection, estimation and classification based on sensor data)
-Sensors in Industrial Practice