{"title":"异构和异构数据聚合与异常检测系统","authors":"Yunze Li, Yuxuan Wu, Ruisen Tang","doi":"10.1145/3520084.3520099","DOIUrl":null,"url":null,"abstract":"With the development of big data technology, data accessed by big data platforms maintain the features of mass, isomerism, heterogeneous, and streaming. Therefore, how to access the varied data sources of isomerism and heterogeneous data and how to process and analyze the data become the current challenges. In this paper, we design and implement a data aggregation and anomaly detection system for isomerism and heterogeneous data. The system proposes a novel isomerism and heterogeneous data access sub-system. The sub-system applies improved Avro as the unified data description format and presents different storage algorithms for data serialization to raise the data adaption efficiency. The system adopts Kafka as the message middleware for data aggregation and distribution. Also, we design the anomaly detection and alarming sub-system for detecting the anomalies of streaming data on time and notifying the users. The data aggregation and anomaly detection system has passed all the tests and applied in small and medium-sized enterprises.","PeriodicalId":444957,"journal":{"name":"Proceedings of the 2022 5th International Conference on Software Engineering and Information Management","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Data Aggregation and Anomaly Detection System for Isomerism and Heterogeneous Data\",\"authors\":\"Yunze Li, Yuxuan Wu, Ruisen Tang\",\"doi\":\"10.1145/3520084.3520099\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the development of big data technology, data accessed by big data platforms maintain the features of mass, isomerism, heterogeneous, and streaming. Therefore, how to access the varied data sources of isomerism and heterogeneous data and how to process and analyze the data become the current challenges. In this paper, we design and implement a data aggregation and anomaly detection system for isomerism and heterogeneous data. The system proposes a novel isomerism and heterogeneous data access sub-system. The sub-system applies improved Avro as the unified data description format and presents different storage algorithms for data serialization to raise the data adaption efficiency. The system adopts Kafka as the message middleware for data aggregation and distribution. Also, we design the anomaly detection and alarming sub-system for detecting the anomalies of streaming data on time and notifying the users. The data aggregation and anomaly detection system has passed all the tests and applied in small and medium-sized enterprises.\",\"PeriodicalId\":444957,\"journal\":{\"name\":\"Proceedings of the 2022 5th International Conference on Software Engineering and Information Management\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 5th International Conference on Software Engineering and Information Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3520084.3520099\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 5th International Conference on Software Engineering and Information Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3520084.3520099","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data Aggregation and Anomaly Detection System for Isomerism and Heterogeneous Data
With the development of big data technology, data accessed by big data platforms maintain the features of mass, isomerism, heterogeneous, and streaming. Therefore, how to access the varied data sources of isomerism and heterogeneous data and how to process and analyze the data become the current challenges. In this paper, we design and implement a data aggregation and anomaly detection system for isomerism and heterogeneous data. The system proposes a novel isomerism and heterogeneous data access sub-system. The sub-system applies improved Avro as the unified data description format and presents different storage algorithms for data serialization to raise the data adaption efficiency. The system adopts Kafka as the message middleware for data aggregation and distribution. Also, we design the anomaly detection and alarming sub-system for detecting the anomalies of streaming data on time and notifying the users. The data aggregation and anomaly detection system has passed all the tests and applied in small and medium-sized enterprises.