End-to-End Multi-Modal Behavioral Context Recognition in a Real-Life Setting

Aaqib Saeed, T. Ozcelebi, S. Trajanovski, J. Lukkien
Published in: 2019 22th International Conference on Information Fusion (FUSION)
Publication date: 2019-07-01
DOI: 10.23919/fusion43075.2019.9011194 (https://doi.org/10.23919/fusion43075.2019.9011194)
Citations: 1

Abstract

Everyday smart devices (such as smartphones and wearables) are increasingly equipped with sensors that provide immense amounts of information about a person's daily life. The automatic and unobtrusive sensing of human behavioral context can help develop solutions for assisted living, fitness tracking, sleep monitoring, and several other fields. Towards this goal, we raise the question: can a machine learn to recognize a diverse set of contexts and activities in a real-life setting by jointly learning from raw multi-modal signals (e.g., accelerometer, gyroscope and audio)? In this paper, we propose a multi-stream network comprising temporal convolution and fully-connected layers to address the problem of multi-label behavioral context recognition. A four-stream network architecture handles learning from each modality, with a contextualization module that combines the extracted representations to infer a user's context. Our empirical evaluation suggests that a deep convolutional network trained end-to-end achieves performance comparable to manual feature engineering with minimal effort. Furthermore, the presented architecture can be extended with similar sensors to improve performance, and it handles missing modalities through multi-task learning on a highly imbalanced and sparsely labeled dataset.
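
The architecture described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the class name MultiStreamSketch, the filter counts, the global max pooling, the single fully-connected output layer standing in for the contextualization module, and the zero-filling of a missing modality are all assumptions made for illustration. The abstract names three example modalities, so the sketch uses three streams; a fourth would be added the same way. Weights are random and untrained.

```python
import math
import random


def conv1d_relu(signal, kernels, biases):
    """Valid temporal convolution over a (time, channels) signal, then ReLU."""
    width = len(kernels[0])
    out = []
    for t in range(len(signal) - width + 1):
        row = []
        for kernel, bias in zip(kernels, biases):
            total = bias
            for i in range(width):
                for c, value in enumerate(signal[t + i]):
                    total += kernel[i][c] * value
            row.append(max(0.0, total))
        out.append(row)
    return out


def global_max_pool(feature_map):
    """Collapse the time axis, keeping the maximum response per filter."""
    return [max(column) for column in zip(*feature_map)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MultiStreamSketch:
    """One convolutional stream per modality, fused by a shared dense layer.

    Hypothetical illustration only; sizes and fusion strategy are assumptions.
    """

    def __init__(self, channels_per_modality, n_filters=4, width=3,
                 n_labels=5, seed=0):
        rng = random.Random(seed)
        self.n_filters = n_filters
        self.streams = {}
        for name, channels in channels_per_modality.items():
            kernels = [[[rng.uniform(-0.5, 0.5) for _ in range(channels)]
                        for _ in range(width)] for _ in range(n_filters)]
            self.streams[name] = (kernels, [0.0] * n_filters)
        fused_size = n_filters * len(self.streams)
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(fused_size)]
                      for _ in range(n_labels)]
        self.b_out = [0.0] * n_labels

    def forward(self, inputs):
        """Return one independent probability per label (multi-label output)."""
        fused = []
        for name, (kernels, biases) in self.streams.items():
            signal = inputs.get(name)
            if signal is None:
                # Assumption: a missing modality contributes an all-zero vector.
                fused.extend([0.0] * self.n_filters)
            else:
                fused.extend(global_max_pool(conv1d_relu(signal, kernels, biases)))
        logits = [sum(w * x for w, x in zip(row, fused)) + b
                  for row, b in zip(self.w_out, self.b_out)]
        return [sigmoid(z) for z in logits]


if __name__ == "__main__":
    model = MultiStreamSketch({"accelerometer": 3, "gyroscope": 3, "audio": 2})
    window = {
        "accelerometer": [[0.1 * t, 0.0, 1.0] for t in range(10)],
        "gyroscope": [[0.0, 0.2, -0.1] for _ in range(10)],
        "audio": None,  # simulate a missing modality
    }
    print(model.forward(window))
```

Each output passes through its own sigmoid rather than a shared softmax, so several context labels can be active at once, which is what a multi-label formulation requires. How the paper actually treats missing modalities and label sparsity during training is not specified in the abstract beyond "multi-task learning", so the zero-filling here is only a placeholder.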