{"title":"A Model-based Framework to Automatically Generate Semi-real Data for Evaluating Data Analysis Techniques","authors":"Guangming Li, R. Carvalho, Wil M.P. van der Aalst","doi":"10.5220/0007713702130220","DOIUrl":null,"url":null,"abstract":"As data analysis techniques progress, the focus shifts from simple tabular data to more complex data at the level of business objects. Evaluating such techniques is therefore far from trivial. Moreover, because of confidentiality constraints, most researchers have difficulty obtaining real data with which to evaluate their techniques. One alternative is to use synthetic data instead of real data, but this leads to unconvincing results. In this paper, we propose a framework that automatically operates information systems (supporting operational processes) to generate semi-real data (i.e., “operation-related data”, excluding images, sound, video, etc.). These data have the same structure as real data and are more realistic than traditional simulated data. A plugin has been implemented to realize the framework for automatic data generation.","PeriodicalId":271024,"journal":{"name":"International Conference on Enterprise Information Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Enterprise Information Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0007713702130220","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Model-based Framework to Automatically Generate Semi-real Data for Evaluating Data Analysis Techniques
As data analysis techniques progress, the focus shifts from simple tabular data to more complex data at the level of business objects. Evaluating such techniques is therefore far from trivial. Moreover, because of confidentiality constraints, most researchers have difficulty obtaining real data with which to evaluate their techniques. One alternative is to use synthetic data instead of real data, but this leads to unconvincing results. In this paper, we propose a framework that automatically operates information systems (supporting operational processes) to generate semi-real data (i.e., “operation-related data”, excluding images, sound, video, etc.). These data have the same structure as real data and are more realistic than traditional simulated data. A plugin has been implemented to realize the framework for automatic data generation.