Data integration in scalable data analytics platform for process industries
M. Sarnovský, P. Bednar, Miroslav Smatana
2017 IEEE 21st International Conference on Intelligent Engineering Systems (INES), published 2017-10-01
DOI: 10.1109/INES.2017.8118553 (https://doi.org/10.1109/INES.2017.8118553)
The main objective of the work presented in this paper is to give an architectural overview of a big data analytics platform that supports process industries. Our aim was to design and develop a cross-sectorial, scalable environment that enables data collection from different sources and supports the development of predictive functions, helping process industries optimize their production processes. This paper introduces the components of the Big Data Storage and Analytics platform, which is the core component of the developed cross-sectorial environment. Currently, it is built on top of the Apache Hadoop technology stack and relies on the Hadoop Distributed File System (HDFS). In addition, we present our approach to integrating data obtained from different production environments. Data integration is implemented using Apache NiFi, and we designed workflows for processing both interval and real-time data from the production sites. We consider two pilot cases: an aluminium factory in France and a plastic molding factory in Portugal.
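To make the distinction between interval and real-time data more concrete, the following is a minimal sketch in plain Python. It is not the authors' NiFi workflow and does not use any NiFi or Hadoop API. It only illustrates one plausible way to normalize records from the two ingestion modes into a common shape before storage. The `Measurement` type, the input formats, the site identifiers, and the date-partitioned path layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Measurement:
    """Hypothetical common record shape shared by both ingestion modes."""
    site: str
    sensor: str
    timestamp: datetime
    value: float
    mode: str  # "interval" or "real-time"


def from_interval_export(site: str, csv_lines: Iterable[str]) -> Iterator[Measurement]:
    """Parse a periodic export (assumed format: a header line, then
    'sensor,iso_timestamp,value' rows)."""
    rows = iter(csv_lines)
    next(rows, None)  # skip the header line
    for line in rows:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sensor, ts, value = line.split(",")
        yield Measurement(site, sensor, datetime.fromisoformat(ts), float(value), "interval")


def from_realtime_event(site: str, event: dict) -> Measurement:
    """Convert one streamed event (assumed keys: sensor, epoch_s, value)."""
    ts = datetime.fromtimestamp(event["epoch_s"], tz=timezone.utc)
    return Measurement(site, event["sensor"], ts, float(event["value"]), "real-time")


def storage_path(m: Measurement) -> str:
    """Hypothetical date-partitioned target path on a distributed file system."""
    return f"/data/{m.site}/{m.timestamp:%Y/%m/%d}/{m.sensor}.csv"


if __name__ == "__main__":
    batch = list(from_interval_export(
        "site_a",
        ["sensor,timestamp,value", "temp_1,2017-10-01T08:00:00+00:00,659.5"],
    ))
    live = from_realtime_event("site_b", {"sensor": "pressure_2", "epoch_s": 1506844800, "value": 3.2})
    for m in batch + [live]:
        print(m.mode, storage_path(m), m.value)
```

Mapping both modes onto one record type means downstream predictive functions would not need to know how a measurement arrived; whether the platform described in the paper takes this approach is not stated in the abstract.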