Cheng-Ying Wu, Qi Zhao, Cheng-Yu Cheng, Yuchen Yang, Muhammad Qureshi, Hang Liu, Genshe Chen
{"title":"基于机器学习的 Apache Storm 实时任务调度","authors":"Cheng-Ying Wu, Qi Zhao, Cheng-Yu Cheng, Yuchen Yang, Muhammad Qureshi, Hang Liu, Genshe Chen","doi":"10.1117/12.3021842","DOIUrl":null,"url":null,"abstract":"Apache Storm is a popular open-source distributed computing platform for real-time big-data processing. However, the existing task scheduling algorithms for Apache Storm do not adequately take into account the heterogeneity and dynamics of node computing resources and task demands, leading to high processing latency and suboptimal performance. In this thesis, we propose an innovative machine learning-based task scheduling scheme tailored for Apache Storm. The scheme leverages machine learning models to predict task performance and assigns a task to the computation node with the lowest predicted processing latency. In our design, each node operates a machine learning-based monitoring mechanism. When the master node schedules a new task, it queries the computation nodes obtains their available resources, and processes latency predictions to make the optimal assignment decision. We explored three machine learning models, including Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Deep Belief Networks (DBN). Our experiments showed that LSTM achieved the most accurate latency predictions. The evaluation results demonstrate that Apache Storm with the proposed LSTM-based scheduling scheme significantly improves the task processing delay and resource utilization, compared to the existing algorithms.","PeriodicalId":178341,"journal":{"name":"Defense + Commercial Sensing","volume":"110 8","pages":"130620I - 130620I-9"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Machine learning-based real-time task scheduling for Apache Storm\",\"authors\":\"Cheng-Ying Wu, Qi Zhao, Cheng-Yu Cheng, Yuchen Yang, Muhammad Qureshi, Hang Liu, Genshe Chen\",\"doi\":\"10.1117/12.3021842\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Apache Storm is a popular open-source distributed computing platform for real-time big-data processing. However, the existing task scheduling algorithms for Apache Storm do not adequately take into account the heterogeneity and dynamics of node computing resources and task demands, leading to high processing latency and suboptimal performance. In this thesis, we propose an innovative machine learning-based task scheduling scheme tailored for Apache Storm. The scheme leverages machine learning models to predict task performance and assigns a task to the computation node with the lowest predicted processing latency. In our design, each node operates a machine learning-based monitoring mechanism. When the master node schedules a new task, it queries the computation nodes obtains their available resources, and processes latency predictions to make the optimal assignment decision. We explored three machine learning models, including Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Deep Belief Networks (DBN). Our experiments showed that LSTM achieved the most accurate latency predictions. The evaluation results demonstrate that Apache Storm with the proposed LSTM-based scheduling scheme significantly improves the task processing delay and resource utilization, compared to the existing algorithms.\",\"PeriodicalId\":178341,\"journal\":{\"name\":\"Defense + Commercial Sensing\",\"volume\":\"110 8\",\"pages\":\"130620I - 130620I-9\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Defense + Commercial Sensing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3021842\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Defense + Commercial Sensing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3021842","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Machine learning-based real-time task scheduling for Apache Storm
Apache Storm is a popular open-source distributed computing platform for real-time big-data processing. However, the existing task scheduling algorithms for Apache Storm do not adequately take into account the heterogeneity and dynamics of node computing resources and task demands, leading to high processing latency and suboptimal performance. In this thesis, we propose an innovative machine learning-based task scheduling scheme tailored for Apache Storm. The scheme leverages machine learning models to predict task performance and assigns a task to the computation node with the lowest predicted processing latency. In our design, each node operates a machine learning-based monitoring mechanism. When the master node schedules a new task, it queries the computation nodes obtains their available resources, and processes latency predictions to make the optimal assignment decision. We explored three machine learning models, including Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Deep Belief Networks (DBN). Our experiments showed that LSTM achieved the most accurate latency predictions. The evaluation results demonstrate that Apache Storm with the proposed LSTM-based scheduling scheme significantly improves the task processing delay and resource utilization, compared to the existing algorithms.