Open Action Recognition by A 3D Convolutional Neural Network Combining with An Open Fuzzy Min-Max Neural Network

Chia-Ying Wu, Y. Tsay, A. C. Shih
Published in: 2022 International Conference on Advanced Robotics and Intelligent Systems (ARIS)
Publication date: 2022-08-24
DOI: 10.1109/ARIS56205.2022.9910444 (https://doi.org/10.1109/ARIS56205.2022.9910444)
Citations: 0

Abstract

The 3-dimensional convolutional neural network (3D CNN) has demonstrated high predictive power for action recognition when the inputs belong to known classes. In real applications, however, inputs may come from unknown classes, and previous studies have shown that some of these inputs falsely receive high softmax scores for known classes. This is the open set recognition problem. Recently, a series of statistical methods based on the OpenMax approach have been proposed to solve the problem for 2D image data, but how to apply that approach to video data is still unknown. In this paper, we propose a two-stage approach for open action recognition that does not use a prior statistical model. In the first stage, a 3D CNN model is trained. In the second stage, the activation vectors, i.e., the outputs of the activation layer, are extracted as feature data for training a fuzzy min-max neural network (FMMNN) as a classifier. Since the values of an activation vector are not limited to the range between 0 and 1, we propose an open FMMNN with a new fuzzy membership function that does not require normalization of the input data, and construct it from the feature data. Finally, the prediction output is the class with the maximum membership value. In the experiments, two separate datasets of mouse action videos were used for training and for the prediction test, respectively. We found that the proposed method can indeed improve the prediction performance. Moreover, using human action videos and random background videos as two unknown datasets, we also demonstrated that the prediction outputs from the known and unknown sets can be distinguished by a single threshold. In short, the proposed open FMMNN can not only improve the prediction performance for inputs from known classes but also detect inputs from unknown classes.
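The second stage described in the abstract (hyperbox classifier on activation vectors, maximum membership, single rejection threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the new membership function, so the exponential-decay form below, the `gamma` and `threshold` values, the one-box-per-class training shortcut, and the class names `class_a` / `class_b` are all assumptions made for illustration. The toy 2D vectors stand in for activation vectors that a trained 3D CNN would produce.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Hyperbox:
    """Axis-aligned box [v, w] in feature space, owned by one known class."""
    v: List[float]  # per-dimension minimum point
    w: List[float]  # per-dimension maximum point
    label: str


def fit_hyperboxes(features: Sequence[Sequence[float]],
                   labels: Sequence[str]) -> List[Hyperbox]:
    """Build one bounding hyperbox per class from its training features.

    Simplification (assumption): a classic FMMNN grows and contracts
    several boxes per class; one box per class keeps the sketch short.
    """
    boxes = {}
    for x, y in zip(features, labels):
        if y not in boxes:
            boxes[y] = Hyperbox(list(x), list(x), y)
        else:
            b = boxes[y]
            b.v = [min(lo, xi) for lo, xi in zip(b.v, x)]
            b.w = [max(hi, xi) for hi, xi in zip(b.w, x)]
    return list(boxes.values())


def membership(x: Sequence[float], box: Hyperbox, gamma: float = 1.0) -> float:
    """Hypothetical membership: 1 inside the box, exponential decay outside.

    It depends only on the distance to the box, so the inputs do not
    need to be normalised to [0, 1].
    """
    gap = sum(max(0.0, lo - xi) + max(0.0, xi - hi)
              for xi, lo, hi in zip(x, box.v, box.w))
    return math.exp(-gamma * gap / len(x))


def predict(x: Sequence[float], boxes: List[Hyperbox],
            threshold: float = 0.5,
            gamma: float = 1.0) -> Tuple[Optional[str], float]:
    """Return (label, score); label is None when the input is rejected."""
    best = max(boxes, key=lambda b: membership(x, b, gamma))
    score = membership(x, best, gamma)
    return (best.label if score >= threshold else None), score


# Toy, unnormalised "activation vectors" for two known classes.
train_x = [[2.0, -1.0], [3.0, 0.5], [-4.0, 6.0], [-5.0, 7.5]]
train_y = ["class_a", "class_a", "class_b", "class_b"]
boxes = fit_hyperboxes(train_x, train_y)

known_label, known_score = predict([2.5, 0.0], boxes)
unknown_label, unknown_score = predict([40.0, -30.0], boxes)
print(known_label, known_score)
print(unknown_label, unknown_score)
```

A point inside a class box receives full membership, while a point far from every box receives a low maximum membership and falls below the threshold, which is the behaviour the abstract reports for known versus unknown inputs. In practice the threshold would be chosen from validation data rather than fixed in advance.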