Boosted learning in dynamic Bayesian networks for multimodal detection

Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997) Pub Date: 2002-07-08 DOI: 10.1109/ICIF.2002.1021202

Tanzeem Choudhury, James M. Rehg, V. Pavlovic, A. Pentland


Citations: 7

Abstract

Bayesian networks are an attractive modeling tool for human sensing, as they combine an intuitive graphical representation with efficient algorithms for inference and learning. Temporal fusion of multiple sensors can be efficiently formulated using dynamic Bayesian networks (DBNs), which allow the power of statistical inference and learning to be combined with contextual knowledge of the problem. Unfortunately, simple learning methods can cause such appealing models to fail when the data exhibits complex behavior. We first demonstrate how boosted parameter learning could be used to improve the performance of Bayesian network classifiers for complex multimodal inference problems. As an example, we apply the framework to the problem of audiovisual speaker detection in an interactive environment using "off-the-shelf" visual and audio sensors (face, skin, texture, mouth motion, and silence detectors). We then introduce a boosted structure learning algorithm. Given labeled data, our algorithm modifies both the network structure and parameters so as to improve classification accuracy. We compare its performance to both standard structure learning and boosted parameter learning. We present results for speaker detection and for datasets from the UCI repository.
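To make the idea of boosted parameter learning concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes AdaBoost-style reweighting, and it substitutes a naive Bayes model over binary features for the Bayesian network classifier, since the abstract does not specify the network or the boosting variant. All function names (fit_weighted_nb, predict_nb, boost_nb, predict_boosted) and the smoothing constant eps are hypothetical choices made for this example.

```python
import numpy as np

def fit_weighted_nb(X, y, w, eps=1e-3):
    """Weighted parameter estimate for naive Bayes on binary features.
    X: (n, d) array of 0/1, y: (n,) labels in {-1, +1}, w: (n,) weights summing to 1.
    eps is an assumed small smoothing constant that keeps probabilities off 0 and 1."""
    params = {}
    for c in (-1, 1):
        mask = y == c
        total = w[mask].sum()
        prior = (total + eps) / (1.0 + 2 * eps)
        theta = (w[mask] @ X[mask] + eps) / (total + 2 * eps)  # P(x_j = 1 | class c)
        params[c] = (prior, theta)
    return params

def predict_nb(params, X):
    """Return the class (-1 or +1) with the larger log joint probability."""
    scores = {}
    for c, (prior, theta) in params.items():
        scores[c] = np.log(prior) + X @ np.log(theta) + (1 - X) @ np.log(1 - theta)
    return np.where(scores[1] >= scores[-1], 1, -1)

def boost_nb(X, y, rounds=10):
    """AdaBoost-style loop: refit the parameters on reweighted data each round."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        params = fit_weighted_nb(X, y, w)
        pred = predict_nb(params, X)
        err = w[pred != y].sum()
        if err >= 0.5:  # no better than chance on the weighted data: stop
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect round
        a = 0.5 * np.log((1 - err) / err)
        ensemble.append((a, params))
        w = w * np.exp(-a * y * pred)  # raise the weight of misclassified samples
        w /= w.sum()
    return ensemble

def predict_boosted(ensemble, X):
    """Weighted vote of all classifiers in the ensemble."""
    votes = np.zeros(len(X))
    for a, params in ensemble:
        votes += a * predict_nb(params, X)
    return np.where(votes >= 0, 1, -1)

# Toy usage: three binary "sensor" outputs, label is their majority vote.
X = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)])
y = np.where(X.sum(axis=1) >= 2, 1, -1)
ensemble = boost_nb(X, y, rounds=5)
labels = predict_boosted(ensemble, X)
```

In this sketch only the conditional probability tables are re-estimated in each round while the graph stays fixed, which mirrors the distinction the abstract draws between boosted parameter learning and the boosted structure learning that also modifies the network.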

