Learning Sequential Transformation Information of Ingredients for Fine-Grained Cooking Activity Recognition

Atsushi Okamoto, Katsufumi Inoue, M. Yoshioka
{"title":"用于细粒度烹饪活动识别的食材序列变换信息学习","authors":"Atsushi Okamoto, Katsufumi Inoue, M. Yoshioka","doi":"10.1145/3552485.3554940","DOIUrl":null,"url":null,"abstract":"The goal of our research is to recognize the fine-grained cooking activities (e.g., dicing or mincing in cutting) in the egocentric videos from the sequential transformation of ingredients that are processed by the camera-wearer; these types of activities are classified according to the state of ingredients after processing, and we often utilize the same cooking utensils and similar motions in such activities. Due to the above conditions, the recognition of such activities is a challenging task in computer vision and multimedia analysis. To tackle this problem, we need to perceive the sequential state transformation of ingredients precisely. In this research, to realize this, we propose a new GAN-based network whose characteristic points are 1) we crop images around the ingredient as a preprocessing to remove the environmental information, 2) we generate intermediate images from the past and future images to obtain the sequential information in the generator network, 3) the adversarial network is employed as a discriminator to classify whether the input image is generated one or not, and 4) we employ the temporally coherent network to check the temporal smoothness of input images and to predict cooking activities by comparing the original sequential images and the generated ones. To investigate the effectiveness of our proposed method, for the first step, we especially focus on \"\\textitcutting activities \". 
From the experimental results with our originally prepared dataset, in this paper, we report the effectiveness of our proposed method.","PeriodicalId":338126,"journal":{"name":"Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications","volume":"272 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning Sequential Transformation Information of Ingredients for Fine-Grained Cooking Activity Recognition\",\"authors\":\"Atsushi Okamoto, Katsufumi Inoue, M. Yoshioka\",\"doi\":\"10.1145/3552485.3554940\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The goal of our research is to recognize the fine-grained cooking activities (e.g., dicing or mincing in cutting) in the egocentric videos from the sequential transformation of ingredients that are processed by the camera-wearer; these types of activities are classified according to the state of ingredients after processing, and we often utilize the same cooking utensils and similar motions in such activities. Due to the above conditions, the recognition of such activities is a challenging task in computer vision and multimedia analysis. To tackle this problem, we need to perceive the sequential state transformation of ingredients precisely. 
In this research, to realize this, we propose a new GAN-based network whose characteristic points are 1) we crop images around the ingredient as a preprocessing to remove the environmental information, 2) we generate intermediate images from the past and future images to obtain the sequential information in the generator network, 3) the adversarial network is employed as a discriminator to classify whether the input image is generated one or not, and 4) we employ the temporally coherent network to check the temporal smoothness of input images and to predict cooking activities by comparing the original sequential images and the generated ones. To investigate the effectiveness of our proposed method, for the first step, we especially focus on \\\"\\\\textitcutting activities \\\". From the experimental results with our originally prepared dataset, in this paper, we report the effectiveness of our proposed method.\",\"PeriodicalId\":338126,\"journal\":{\"name\":\"Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications\",\"volume\":\"272 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3552485.3554940\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st International Workshop on Multimedia for Cooking, Eating, and related 
APPlications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3552485.3554940","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

The goal of our research is to recognize fine-grained cooking activities (e.g., dicing or mincing within cutting) in egocentric videos from the sequential transformation of the ingredients processed by the camera wearer. These activities are classified according to the state of the ingredients after processing, and they often involve the same cooking utensils and similar motions. Under these conditions, recognizing such activities is a challenging task in computer vision and multimedia analysis. To tackle this problem, we need to perceive the sequential state transformation of ingredients precisely. To realize this, we propose a new GAN-based network whose characteristic points are: 1) we crop images around the ingredient as preprocessing to remove environmental information; 2) we generate intermediate images from past and future images to obtain sequential information in the generator network; 3) an adversarial network is employed as a discriminator to classify whether an input image is a generated one or not; and 4) we employ a temporally coherent network to check the temporal smoothness of input images and to predict cooking activities by comparing the original sequential images with the generated ones. To investigate the effectiveness of our proposed method, as a first step, we focus in particular on cutting activities. From experimental results on our originally prepared dataset, we report the effectiveness of the proposed method.
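The pipeline steps 1), 2), and 4) above can be sketched in a few lines. The sketch below is only an illustrative stand-in, not the paper's method: the bounding box around the ingredient is assumed to be given (e.g., by a detector), the intermediate image is produced by naive linear blending where the paper instead trains a GAN generator, and the sequence comparison uses plain per-frame MSE where the paper employs a learned temporally coherent network.

```python
import numpy as np

def crop_around_ingredient(frame, box, margin=8):
    """Step 1: crop the frame to the ingredient's bounding box (assumed
    given), discarding surrounding environmental information."""
    x0, y0, x1, y1 = box
    h, w = frame.shape[:2]
    x0, y0 = max(0, x0 - margin), max(0, y0 - margin)
    x1, y1 = min(w, x1 + margin), min(h, y1 + margin)
    return frame[y0:y1, x0:x1]

def interpolate_intermediate(past, future, alpha=0.5):
    """Step 2 (naive stand-in for the learned generator): blend a past
    and a future crop linearly to produce an intermediate image."""
    blended = (1 - alpha) * past.astype(np.float32) + alpha * future.astype(np.float32)
    return blended.astype(np.uint8)

def sequence_distance(seq_a, seq_b):
    """Step 4 (naive stand-in for the temporally coherent network):
    mean per-frame MSE between an original and a generated sequence."""
    diffs = [np.mean((a.astype(np.float32) - b.astype(np.float32)) ** 2)
             for a, b in zip(seq_a, seq_b)]
    return float(np.mean(diffs))
```

In this simplified view, a small distance between the original sequence and the generated one would indicate that the assumed transformation model matches the observed ingredient states; the paper replaces both the blending and the comparison with learned networks trained adversarially.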