First Person Vision for Activity Prediction Using Probabilistic Modeling
Shaheena Noor, Vali Uddin
DOI: 10.22581/MUET1982.1804.09
Published: October 2018
Identifying activities of daily living is an important research area with applications in smart homes and healthcare for the elderly. The task is challenging for several reasons: human self-occlusion, complex natural environments, and the variability of human behavior when performing a complicated task. Psychological studies show that human gaze is closely linked to the thought process: we tend to “look” at objects before acting on them. We therefore use the object information present in gaze images as context, and this forms the basis for activity prediction. Our system is based on HMMs (Hidden Markov Models) and trained using ANNs (Artificial Neural Networks). We begin by extracting motion information from TPV (Third Person Vision) streams and object information from FPV (First Person Vision) cameras. The advantage of FPV is that the object information provides the context of the scene; when this context is included as input to the HMM for activity recognition, precision increases. For testing, we used two standard datasets: TUM (Technische Universitaet Muenchen) and GTEA Gaze+ (Georgia Tech Egocentric Activities). In the first round, we trained our ANNs with activity information only; in the second round, we added the object information as well. The precision of predicted activities increased significantly, from 55.21% to 77.61%, and accuracy from 85.25% to 93.5%. This confirms our initial hypothesis that including the actor's focus of attention, in the form of objects seen in FPV, helps predict activities better.
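To illustrate the core idea, here is a minimal sketch (not the authors' implementation) of an HMM whose hidden states are activities and whose observations are the objects fixated in first-person gaze images; the most likely activity sequence is decoded with the Viterbi algorithm. The activity names, object names, and all probability values below are invented for illustration.

```python
import numpy as np

# Hypothetical activities (hidden states) and gazed-at objects (observations).
activities = ["take_cup", "pour_water", "stir"]
objects = ["cup", "kettle", "spoon"]

start = np.array([0.6, 0.3, 0.1])          # P(first activity)
trans = np.array([[0.5, 0.4, 0.1],         # P(next activity | current activity)
                  [0.1, 0.5, 0.4],
                  [0.2, 0.2, 0.6]])
emit = np.array([[0.8, 0.1, 0.1],          # P(fixated object | activity)
                 [0.2, 0.7, 0.1],
                 [0.1, 0.1, 0.8]])

def viterbi(obs_seq):
    """Return the most likely activity sequence for a sequence of object indices."""
    T, N = len(obs_seq), len(activities)
    logp = np.full((T, N), -np.inf)        # best log-probability ending in state j at time t
    back = np.zeros((T, N), dtype=int)     # backpointers for path recovery
    logp[0] = np.log(start) + np.log(emit[:, obs_seq[0]])
    for t in range(1, T):
        for j in range(N):
            scores = logp[t - 1] + np.log(trans[:, j])
            back[t, j] = int(np.argmax(scores))
            logp[t, j] = scores[back[t, j]] + np.log(emit[j, obs_seq[t]])
    path = [int(np.argmax(logp[-1]))]
    for t in range(T - 1, 0, -1):          # walk backpointers from the end
        path.append(int(back[t, path[-1]]))
    return [activities[s] for s in reversed(path)]

# A gaze stream of fixated objects: cup, then kettle twice, then spoon.
obs = [objects.index(o) for o in ["cup", "kettle", "kettle", "spoon"]]
print(viterbi(obs))
# → ['take_cup', 'pour_water', 'pour_water', 'stir']
```

In the paper's pipeline the emission side is richer (ANN-derived motion and object features rather than a single symbol), but the decoding principle is the same: the fixated object constrains which activity is plausible at each time step.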