Authors: Xinlei Jin, Quan Qian
Journal: Expert Systems with Applications, Volume 262, Article 125593 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2024-10-28
DOI: 10.1016/j.eswa.2024.125593
URL: https://www.sciencedirect.com/science/article/pii/S0957417424024606
MEFDPN: Mixture exponential family distribution posterior networks for evaluating data uncertainty
Computing uncertainty is crucial for developing reliable machine learning models. The natural posterior network (NatPN) provides uncertainty estimates for any single exponential family distribution, but real-world data are often too complex for a single distribution to capture. We therefore introduce the mixture exponential family distribution posterior network (MEFDPN), which extends the prior to a mixture of exponential family distributions so that it can fit complex distributions that better represent real data. During training, MEFDPN updates the Bayesian posterior estimate for each prior distribution independently and updates the weights of these distributions from the forward propagation results. MEFDPN also computes two types of uncertainty, aleatoric and epistemic, and combines them by entropy weighting into a comprehensive confidence measure for each data point. Theoretically, MEFDPN achieves higher prediction accuracy, and experimental results demonstrate that it computes high-quality comprehensive confidence for data. It also shows encouraging accuracy in out-of-distribution (OOD) detection and validation experiments. Finally, we apply MEFDPN to a materials dataset, where it efficiently filters out OOD data and significantly improves the prediction accuracy of machine learning models: removing only 5% of outlier data yields a 2%–5% improvement in accuracy.
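The abstract does not give the formula behind the entropy weighting, so the following is only a minimal sketch of how two per-sample uncertainty scores might be combined and then used to drop the least confident 5% of points. It assumes the classic entropy weight method (a score whose values are spread less uniformly across samples receives a larger weight), which may differ from the authors' actual procedure. The function names `entropy_weights`, `combined_confidence`, and `drop_least_confident` are hypothetical, as is the min-max scaling step.

```python
import math


def min_max(values):
    """Scale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def entropy_weights(columns):
    """Entropy weight method (assumed here, not confirmed by the abstract).

    `columns` is a list of equal-length lists of non-negative scores,
    each with at least two entries. Returns one weight per column.
    """
    n = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        probs = [1.0 / n] * n if total == 0 else [v / total for v in col]
        # Shannon entropy normalised to [0, 1] by log(n).
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    norm = sum(divergences)
    if norm == 0:
        return [1.0 / len(columns)] * len(columns)
    return [d / norm for d in divergences]


def combined_confidence(aleatoric, epistemic):
    """Weighted sum of the two scaled uncertainties, turned into a confidence."""
    cols = [min_max(aleatoric), min_max(epistemic)]
    w_a, w_e = entropy_weights(cols)
    return [1.0 - (w_a * a + w_e * e) for a, e in zip(*cols)]


def drop_least_confident(confidence, fraction=0.05):
    """Return the indices kept after removing the least confident fraction."""
    n_drop = int(len(confidence) * fraction)
    order = sorted(range(len(confidence)), key=lambda i: confidence[i])
    dropped = set(order[:n_drop])
    return [i for i in range(len(confidence)) if i not in dropped]
```

In this sketch, the indices returned by `drop_least_confident` would select the training subset passed to a downstream model; the 5% default mirrors the removal rate quoted in the abstract.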
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.