Design and implementation of a load shedding engine for solving starvation problems in Apache Kafka

NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium Pub Date : 2018-04-23 DOI:10.1109/NOMS.2018.8406306

Jiwon Bang, Siwoon Son, Hajin Kim, Yang-Sae Moon, Mi-Jung Choi


Citations: 11

Abstract

Real-time data stream processing technologies such as Apache Storm and Apache Spark are being actively studied to handle high-volume data streams that are generated rapidly in real time. Because most real-time processing technologies are difficult to use alone, they are commonly combined with a messaging system that supports the input and output of data streams. Apache Kafka is a representative distributed messaging system that specializes in delivering large amounts of real-time log data. However, if data is produced in Kafka faster than it is consumed, a data starvation problem may arise. Solving this problem requires a load shedding technique that limits the incoming data and maintains system performance when the system is under load. In this paper, we confirm that the starvation problem can occur in Kafka, design and implement a load shedding engine to address it, and propose a solution to the starvation problem in Kafka based on performance experiments.
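The abstract does not describe how the engine decides which data to limit. As a rough illustration of the general idea only, the sketch below shows one common load shedding policy, random dropping, in which the fraction of dropped messages grows as the production rate exceeds the rate the consumer can sustain. The class name `LoadShedder`, its parameters, and the drop formula are assumptions made for this example; they are not the authors' design and do not use any Kafka API.

```python
import random


class LoadShedder:
    """Hypothetical random-drop load shedder (illustrative, not the paper's engine)."""

    def __init__(self, capacity_per_sec, seed=None):
        # capacity_per_sec: assumed rate (messages/s) the consumer can sustain.
        self.capacity_per_sec = capacity_per_sec
        self._rng = random.Random(seed)

    def drop_probability(self, production_rate):
        """Fraction of messages to shed so the admitted rate matches capacity."""
        if production_rate <= self.capacity_per_sec:
            return 0.0  # Not overloaded: admit everything.
        return 1.0 - self.capacity_per_sec / production_rate

    def admit(self, production_rate):
        """Return True if the next message should be forwarded, False if shed."""
        return self._rng.random() >= self.drop_probability(production_rate)


# Usage: with an assumed capacity of 1000 msg/s and a producer at 2000 msg/s,
# roughly half of the incoming messages are shed.
shedder = LoadShedder(capacity_per_sec=1000, seed=42)
admitted = sum(shedder.admit(production_rate=2000) for _ in range(10_000))
```

Random dropping is the simplest policy to reason about. A real engine might instead shed by message priority or by topic, which this sketch does not attempt.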


