A Novel Big Data Processing Approach to Feature Extraction for Electrical Discharge Machining based on Container Technology

Denata Rizky Alimadji, Min-Hsiung Hung, Yu-Chuan Lin, Benny Suryajaya, Chao-Chun Chen
Published in: 2021 IEEE/ACIS 22nd International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
Publication date: 2021-11-24
DOI: 10.1109/SNPD51163.2021.9704989 (https://doi.org/10.1109/SNPD51163.2021.9704989)

Abstract

Electrical discharge machining (EDM) is a process that removes metal from conductive materials using electrical sparks. Monitoring the EDM process with virtual metrology (VM) requires the voltage and current signals of the machine tool's electrode. Because of the nature of EDM, the sensors installed on the machine tool acquire these signals at a high sampling rate and generate a vast amount of data in a short time, which raises a big-data processing problem. Our previous work proposed an efficient approach, called BEDPS, for processing EDM big data in a Hadoop distributed cluster. This paper presents a novel big data processing approach to feature extraction for EDM that uses container technology (i.e., Docker and Kubernetes). We re-implement some of the Spark algorithms of BEDPS in Python (they were originally written in Scala) and then run the refined BEDPS in containers in a Kubernetes cluster. Test results show that the refined BEDPS developed in this study reduces the execution time by almost half compared to the original Scala version (9.6577 minutes vs. 19.2735 minutes). Python in Spark is also shown to perform similarly to Scala, although Python falls short in some cases, for example, parallel processing with a Python parallel processing library. The results also show that a Kubernetes cluster is a promising alternative to Hadoop for processing big data. In addition, it brings several advantages to big data processing applications, such as easy deployment, robust operation, load balancing, self-healing, failover, and horizontal auto-scaling for containerized applications.
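The "almost half" figure can be checked directly from the two execution times quoted in the abstract. The short Python sketch below is only an illustration of that arithmetic; the function name `compare_runtimes` and the constant names are hypothetical and do not come from the paper.

```python
# Execution times reported in the abstract, in minutes.
ORIGINAL_SCALA_MIN = 19.2735  # original Scala version of BEDPS
REFINED_MIN = 9.6577          # refined BEDPS from this study


def compare_runtimes(baseline_min, refined_min):
    """Return (minutes saved, fractional reduction, speedup factor).

    Hypothetical helper for illustration; not part of BEDPS.
    """
    saved = baseline_min - refined_min
    reduction = saved / baseline_min
    speedup = baseline_min / refined_min
    return saved, reduction, speedup


saved, reduction, speedup = compare_runtimes(ORIGINAL_SCALA_MIN, REFINED_MIN)
print(f"saved {saved:.4f} min, reduction {reduction:.1%}, speedup {speedup:.2f}x")
```

With the reported numbers this gives a saving of about 9.6 minutes, a reduction of roughly 49.9%, and a speedup of about 2x, which is consistent with the abstract's "almost half".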