Tanzeem Choudhury, James M. Rehg, V. Pavlovic, A. Pentland
{"title":"多模态检测中动态贝叶斯网络的增强学习","authors":"Tanzeem Chaodhury, James M. Rehg, V. Pavlovic, A. Pentland","doi":"10.1109/ICIF.2002.1021202","DOIUrl":null,"url":null,"abstract":"Bayesian networks are an attractive modeling tool for human sensing, as they combine an intuitive graphical representation with efficient algorithms for inference and learning. Temporal fusion of multiple sensors can be efficiently formulated using dynamic Bayesian networks (DBNs) which allow the power of statistical inference and learning to be combined with contextual knowledge of the problem. Unfortunately, simple learning methods can cause such appealing models to fail when the data exhibits complex behavior We first demonstrate how boosted parameter learning could be used to improve the performance of Bayesian network classifiers for complex multimodal inference problems. As an example we apply the framework to the problem of audiovisual speaker detection in an interactive environment using \"off-the-shelf\" visual and audio sensors (face, skin, texture, mouth motion, and silence detectors). We then introduce a boosted structure learning algorithm. Given labeled data, our algorithm modifies both the network structure and parameters so as to improve classification accuracy. We compare its performance to both standard structure learning and boosted parameter learning. We present results for speaker detection and for datasets from the UCI repository.","PeriodicalId":399150,"journal":{"name":"Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Boosted learning in dynamic Bayesian networks for multimodal detection\",\"authors\":\"Tanzeem Chaodhury, James M. Rehg, V. Pavlovic, A. Pentland\",\"doi\":\"10.1109/ICIF.2002.1021202\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Bayesian networks are an attractive modeling tool for human sensing, as they combine an intuitive graphical representation with efficient algorithms for inference and learning. Temporal fusion of multiple sensors can be efficiently formulated using dynamic Bayesian networks (DBNs) which allow the power of statistical inference and learning to be combined with contextual knowledge of the problem. Unfortunately, simple learning methods can cause such appealing models to fail when the data exhibits complex behavior We first demonstrate how boosted parameter learning could be used to improve the performance of Bayesian network classifiers for complex multimodal inference problems. As an example we apply the framework to the problem of audiovisual speaker detection in an interactive environment using \\\"off-the-shelf\\\" visual and audio sensors (face, skin, texture, mouth motion, and silence detectors). We then introduce a boosted structure learning algorithm. Given labeled data, our algorithm modifies both the network structure and parameters so as to improve classification accuracy. We compare its performance to both standard structure learning and boosted parameter learning. We present results for speaker detection and for datasets from the UCI repository.\",\"PeriodicalId\":399150,\"journal\":{\"name\":\"Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. 
(IEEE Cat.No.02EX5997)\",\"volume\":\"146 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2002-07-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICIF.2002.1021202\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICIF.2002.1021202","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Boosted learning in dynamic Bayesian networks for multimodal detection
Bayesian networks are an attractive modeling tool for human sensing, as they combine an intuitive graphical representation with efficient algorithms for inference and learning. Temporal fusion of multiple sensors can be efficiently formulated using dynamic Bayesian networks (DBNs), which allow the power of statistical inference and learning to be combined with contextual knowledge of the problem. Unfortunately, simple learning methods can cause such appealing models to fail when the data exhibits complex behavior. We first demonstrate how boosted parameter learning can be used to improve the performance of Bayesian network classifiers for complex multimodal inference problems. As an example, we apply the framework to the problem of audiovisual speaker detection in an interactive environment using "off-the-shelf" visual and audio sensors (face, skin, texture, mouth motion, and silence detectors). We then introduce a boosted structure learning algorithm. Given labeled data, our algorithm modifies both the network structure and parameters so as to improve classification accuracy. We compare its performance to both standard structure learning and boosted parameter learning. We present results for speaker detection and for datasets from the UCI repository.
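To make the idea of boosted parameter learning over probabilistic classifiers concrete, the sketch below shows discrete AdaBoost where each weak learner is a simple Gaussian naive-Bayes classifier fit on sample weights. This is an illustrative assumption, not the paper's method: the paper boosts DBN-based audiovisual detectors, whereas the feature layout, the two-class (+1 speaker / -1 non-speaker) encoding, and the class and function names here are invented for the example.

```python
import numpy as np


class GaussianNBWeak:
    """Weak learner: per-class Gaussian naive Bayes fit on a weighted
    sample. Reweighting the data is the 'boosted parameter learning' step."""

    def fit(self, X, y, w):
        self.classes_ = np.array([-1, 1])
        self.params_ = {}
        for c in self.classes_:
            m = (y == c)
            wc = w[m] / w[m].sum()
            mu = (wc[:, None] * X[m]).sum(axis=0)                      # weighted mean
            var = (wc[:, None] * (X[m] - mu) ** 2).sum(axis=0) + 1e-6  # weighted variance
            prior = w[m].sum() / w.sum()                               # weighted class prior
            self.params_[c] = (mu, var, prior)
        return self

    def predict(self, X):
        scores = []
        for c in self.classes_:
            mu, var, prior = self.params_[c]
            loglik = -0.5 * (((X - mu) ** 2) / var + np.log(2 * np.pi * var)).sum(axis=1)
            scores.append(loglik + np.log(prior))
        return self.classes_[np.argmax(np.stack(scores, axis=1), axis=1)]


def adaboost(X, y, n_rounds=10):
    """Discrete AdaBoost: upweight the examples the current round misclassifies."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    learners, alphas = [], []
    for _ in range(n_rounds):
        h = GaussianNBWeak().fit(X, y, w)
        pred = h.predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)   # y * pred is +1 if correct, -1 if wrong
        w /= w.sum()
        learners.append(h)
        alphas.append(alpha)
    return learners, alphas


def predict_ensemble(learners, alphas, X):
    """Weighted vote of the boosted weak classifiers."""
    agg = sum(a * h.predict(X) for h, a in zip(learners, alphas))
    return np.sign(agg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-ins for multimodal sensor outputs (e.g. face score,
    # mouth-motion energy, audio non-silence score); purely illustrative.
    n = 400
    y = rng.choice([-1, 1], size=n)
    X = rng.normal(loc=np.where(y[:, None] > 0, 1.0, -1.0), scale=1.5, size=(n, 3))
    learners, alphas = adaboost(X, y, n_rounds=10)
    acc = (predict_ensemble(learners, alphas, X) == y).mean()
    print(f"training accuracy: {acc:.3f}")
```

The design point the sketch tries to convey is the one the abstract makes: rather than relying on a single maximum-likelihood fit, each boosting round refits the classifier's parameters on a reweighted sample, so later rounds concentrate on the hard, "complex-behavior" examples that a single simple model would misclassify.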