Behavioral Data Categorization for Transformers-based Models in Digital Health

C. Siebra, Igor Matias, K. Wac
{"title":"Behavioral Data Categorization for Transformers-based Models in Digital Health","authors":"C. Siebra, Igor Matias, K. Wac","doi":"10.1109/BHI56158.2022.9926938","DOIUrl":null,"url":null,"abstract":"Transformers are recent deep learning (DL) models used to capture the dependence between parts of sequential data. While their potential was already demonstrated in the natural language processing (NLP) domain, emerging research shows transformers can also be an adequate modeling approach to relate longitudinal multi-featured continuous behavioral data to future health outcomes. As transformers-based predictions are based on a domain lexicon, the use of categories, commonly used in specialized areas to cluster values, is the likely way to compose lexica. However, the number of categories may influence the transformer prediction accuracy, mainly when the categorization process creates imbalanced datasets, or the search space is very restricted to generate optimal feasible solutions. This paper analyzes the relationship between models' accuracy and the sparsity of behavioral data categories that compose the lexicon. This analysis relies on a case example that uses mQoL-Transformer to model the influence of physical activity behavior on sleep health. Results show that the number of categories shall be treated as a further transformer's hyperparameter, which can balance the literature-based categorization and optimization aspects. Thus, DL processes could also obtain similar accuracies compared to traditional approaches, such as long short-term memory, when used to process short behavioral data sequences.","PeriodicalId":347210,"journal":{"name":"2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BHI56158.2022.9926938","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Transformers are recent deep learning (DL) models used to capture the dependence between parts of sequential data. While their potential was already demonstrated in the natural language processing (NLP) domain, emerging research shows transformers can also be an adequate modeling approach to relate longitudinal multi-featured continuous behavioral data to future health outcomes. As transformers-based predictions are based on a domain lexicon, the use of categories, commonly used in specialized areas to cluster values, is the likely way to compose lexica. However, the number of categories may influence the transformer prediction accuracy, mainly when the categorization process creates imbalanced datasets, or the search space is very restricted to generate optimal feasible solutions. This paper analyzes the relationship between models' accuracy and the sparsity of behavioral data categories that compose the lexicon. This analysis relies on a case example that uses mQoL-Transformer to model the influence of physical activity behavior on sleep health. Results show that the number of categories shall be treated as a further transformer's hyperparameter, which can balance the literature-based categorization and optimization aspects. Thus, DL processes could also obtain similar accuracies compared to traditional approaches, such as long short-term memory, when used to process short behavioral data sequences.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
数字健康中基于变压器模型的行为数据分类
变形器是最近的深度学习(DL)模型,用于捕获序列数据各部分之间的依赖性。虽然它们的潜力已经在自然语言处理(NLP)领域得到了证明,但新兴研究表明,变压器也可以作为一种适当的建模方法,将纵向多特征连续行为数据与未来的健康结果联系起来。由于基于转换器的预测是基于领域词典的,因此使用类别(通常用于专门领域对值进行聚类)是组成词典的可能方法。然而,类别的数量可能会影响变压器的预测精度,主要是当分类过程产生不平衡的数据集,或者搜索空间非常有限,无法产生最优可行解时。本文分析了模型的准确性与构成词典的行为数据类别的稀疏度之间的关系。此分析依赖于使用mQoL-Transformer对身体活动行为对睡眠健康的影响进行建模的案例示例。结果表明,类别数量应被视为进一步的变压器的超参数,它可以平衡基于文献的分类和优化方面。因此,与传统方法(如长短期记忆)相比,深度学习过程在处理短行为数据序列时也可以获得相似的准确性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
BEBOP: Bidirectional dEep Brain cOnnectivity maPping Stabilizing Skeletal Pose Estimation using mmWave Radar via Dynamic Model and Filtering Behavioral Data Categorization for Transformers-based Models in Digital Health Gender Difference in Prognosis of Patients with Heart Failure: A Propensity Score Matching Analysis Influence of Sensor Position and Body Movements on Radar-Based Heart Rate Monitoring
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1