Monitoring Resources of Machine Learning Engine in Microservices Architecture
Authors: Nikunj Parekh, Swathi Kurunji, Alan Beck
Published in: 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2018-11-01
DOI: 10.1109/IEMCON.2018.8614791 (https://doi.org/10.1109/IEMCON.2018.8614791)
Microservices architecture facilitates building distributed, scalable software products, usually deployed in a cloud environment. Monitoring microservices deployed in Kubernetes-orchestrated, distributed advanced-analytics machine learning engines is at the heart of many cloud resource management solutions. In addition, measuring resource utilization at a more granular level, such as on a per-query or per-sub-query basis, in a massively parallel processing (MPP) Machine Learning Engine (MLE) is key to resource planning and is also the focus of our work. In this paper we propose two mechanisms to measure resource utilization in the Teradata Machine Learning Engine (MLE). The first mechanism is Cluster Resource Monitoring (CRM). CRM is a high-level resource measuring mechanism that lets IT administrators and analytics users visualize and plot overall cluster usage statistics, generate alerts on them, and perform live and historical analytics over them. The second mechanism is Query Resource Monitoring (QRM). QRM enables IT administrators and MLE users to measure compute resource utilization for each individual query and its sub-queries. When a query takes a long time, QRM helps identify the expensive phases within it that tax certain resources more heavily and skew the work distribution. We show the results of the proposed mechanisms and highlight use cases.
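To make the idea of per-query measurement concrete, the following is a minimal sketch of how per-phase usage samples could be rolled up to find expensive and skewed phases. It is not the QRM implementation described in the paper: the sample format `(phase, worker, cpu_seconds)`, the function name `summarize_query`, the worker and phase names, and the skew metric (busiest worker's load divided by the mean load) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_query(samples):
    """Roll up hypothetical (phase, worker, cpu_seconds) samples per phase.

    Returns, for each phase, the total CPU seconds and a skew ratio:
    the busiest worker's load divided by the mean worker load.
    A ratio near 1.0 means the work was evenly distributed.
    """
    per_phase = defaultdict(lambda: defaultdict(float))
    for phase, worker, cpu_seconds in samples:
        per_phase[phase][worker] += cpu_seconds

    summary = {}
    for phase, by_worker in per_phase.items():
        loads = list(by_worker.values())
        avg = mean(loads)
        summary[phase] = {
            "total_cpu_s": sum(loads),
            "skew": max(loads) / avg if avg > 0 else 1.0,
        }
    return summary


# Illustrative data only: two phases of one query on two workers.
samples = [
    ("scan", "worker-0", 4.0),
    ("scan", "worker-1", 4.0),
    ("join", "worker-0", 18.0),
    ("join", "worker-1", 2.0),
]

summary = summarize_query(samples)
most_expensive = max(summary, key=lambda p: summary[p]["total_cpu_s"])
most_skewed = max(summary, key=lambda p: summary[p]["skew"])
print(most_expensive, most_skewed)
```

In this made-up data the `join` phase is both the most expensive and the most skewed, because one worker carries most of its load. A real deployment would draw the samples from whatever metrics source the cluster exposes and would likely track memory, I/O, and network alongside CPU.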