Synthetic Datasets for Numeric Uncertainty Quantification: Proposing Datasets for Future Researchers
H. M. D. Kabir, Moloud Abdar, A. Khosravi, D. Nahavandi, S. Mondal, Sadia Khanam, Shady M. K. Mohamed, D. Srinivasan, Saeid Nahavandi, P. N. Suganthan
IEEE Systems Man and Cybernetics Magazine, pp. 39-48, published 2023-04-01. DOI: 10.1109/MSMC.2022.3218423 (https://doi.org/10.1109/MSMC.2022.3218423)
Citations: 1
Abstract
In this article, we propose ten synthetic datasets for point prediction and numeric uncertainty quantification (UQ). Each dataset is split into training, validation, and test sets for model benchmarking. The equations and a description of each dataset are provided in detail. We also present representative training examples for a shallow neural network (NN) and a random vector functional link (RVFL) network, both of which are trained for point prediction. We perform UQ under the assumption of a Gaussian, homoscedastic distribution. The distributional assumptions and models are kept deliberately simple for the following reasons: 1) much room remains for further exploration and improvement, 2) users of the datasets get simple training examples, including the process of accessing the data, and 3) users get an idea of the probable results and their format. The datasets and scripts are available at the following link: https://github.com/dipuk0506/UQ-Data.
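As a rough illustration of the workflow the abstract describes (RVFL point prediction followed by Gaussian, homoscedastic UQ), the following is a minimal sketch and not the authors' scripts. Everything in it is an assumption made for illustration: the toy sine dataset stands in for the proposed datasets and is not one of them, the function names `fit_rvfl`, `predict_rvfl`, `gaussian_interval`, and `make_toy_data` are hypothetical, and estimating a single standard deviation from validation residuals is just one plausible way to realize a homoscedastic Gaussian interval.

```python
import numpy as np

def _features(x, w, b):
    # RVFL features: direct input links, random tanh hidden units, and a bias column.
    return np.hstack([x, np.tanh(x @ w + b), np.ones((x.shape[0], 1))])

def fit_rvfl(x, y, n_hidden=50, ridge=1e-3, seed=0):
    """Fit a basic RVFL: hidden weights are random and fixed,
    only the output weights are solved, here by ridge regression."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    feats = _features(x, w, b)
    gram = feats.T @ feats + ridge * np.eye(feats.shape[1])
    beta = np.linalg.solve(gram, feats.T @ y)
    return w, b, beta

def predict_rvfl(model, x):
    w, b, beta = model
    return _features(x, w, b) @ beta

def gaussian_interval(y_pred, sigma, z=1.96):
    """Homoscedastic Gaussian interval: the same sigma for every input.
    z=1.96 corresponds to a nominal 95% interval."""
    return y_pred - z * sigma, y_pred + z * sigma

def make_toy_data(n, rng):
    # Hypothetical stand-in data: a sine curve with constant-variance Gaussian noise.
    x = rng.uniform(-3.0, 3.0, size=(n, 1))
    y = np.sin(x[:, 0]) + rng.normal(scale=0.2, size=n)
    return x, y

rng = np.random.default_rng(42)
x_train, y_train = make_toy_data(600, rng)
x_val, y_val = make_toy_data(200, rng)
x_test, y_test = make_toy_data(200, rng)

# Point prediction model trained on the training split.
model = fit_rvfl(x_train, y_train)

# One global sigma estimated from validation residuals (assumed procedure).
sigma = np.std(y_val - predict_rvfl(model, x_val))

# Intervals and two common UQ summaries on the test split.
lower, upper = gaussian_interval(predict_rvfl(model, x_test), sigma)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
mean_width = np.mean(upper - lower)
print(f"sigma={sigma:.3f} coverage={coverage:.3f} mean width={mean_width:.3f}")
```

Because the interval width is constant across inputs, this kind of baseline is expected to fit data with constant noise reasonably well and to miscalibrate locally on heteroscedastic data, which is consistent with the abstract's remark that the simple setup leaves room for improvement.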