Data and resource aware incremental ML training in support of pervasive applications

IF 3.3 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, THEORY & METHODS · Computing · Pub Date: 2024-08-16 · DOI: 10.1007/s00607-024-01338-2
Thanasis Moustakas, Athanasios Tziouvaras, Kostas Kolomvatsos
Citations: 0

Abstract

Nowadays, the use of Artificial Intelligence (AI) and Machine Learning (ML) algorithms increasingly affects the performance of innovative systems. At the same time, the advent of the Internet of Things (IoT) and Edge Computing (EC) as means to place computational resources close to users creates the need for new models in the training process of ML schemes, due to the limited computational capabilities of the devices/nodes placed there. In any case, we should not forget that IoT devices and EC nodes exhibit fewer capabilities than the Cloud back end, which could be adopted for more complex training upon vast volumes of data. The ideal case is to have at least basic training capabilities in the IoT-EC ecosystem in order to reduce latency and meet the needs of near-real-time applications. In this paper, we are motivated by this need and propose a model that tries to save time in the training process by focusing on the training dataset and its statistical description. We do not dive into the architecture of any ML model, as we aim to provide a generic scheme that can be applied to any ML module. We monitor the statistics of the training dataset and the loss during the process, and identify whether training can be stopped when no significant contribution is foreseen from the data not yet adopted by the model. We argue that our approach is applicable only when a negligible decrease in accuracy is acceptable to the application, in exchange for the time and resources saved in the training process. We provide two algorithms for applying this approach and an extensive experimental evaluation upon multiple supervised ML models to reveal the benefits of the proposed scheme and its constraints.
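To illustrate the general idea described in the abstract, the sketch below monitors both the per-batch loss improvement and the shift in simple statistics of the incoming training data, and halts incremental training once neither changes significantly for a few consecutive batches. This is a minimal illustrative sketch, not the authors' two algorithms; the function names, thresholds, and stopping rule are assumptions introduced here.

```python
def batch_stats(batch):
    """Mean and variance of a numeric batch (a simple statistical description)."""
    m = sum(batch) / len(batch)
    v = sum((x - m) ** 2 for x in batch) / len(batch)
    return m, v

def incremental_train(batches, train_step, loss_eps=1e-3, stat_eps=1e-2, patience=3):
    """Feed batches to an ML model incrementally; stop early once both the
    loss improvement and the shift in the batch statistics stay below the
    given thresholds for `patience` consecutive batches.  `train_step` is a
    hypothetical callback that performs one incremental update and returns
    the current loss.  Returns the number of batches actually consumed."""
    prev_loss, prev_stats, calm, used = None, None, 0, 0
    for batch in batches:
        loss = train_step(batch)        # one incremental model update
        stats = batch_stats(batch)
        used += 1
        if prev_loss is not None:
            loss_delta = abs(prev_loss - loss)
            stat_delta = abs(prev_stats[0] - stats[0]) + abs(prev_stats[1] - stats[1])
            # count consecutive "calm" batches where nothing moves much
            calm = calm + 1 if (loss_delta < loss_eps and stat_delta < stat_eps) else 0
            if calm >= patience:        # remaining data unlikely to contribute
                break
        prev_loss, prev_stats = loss, stats
    return used
```

The trade-off the abstract describes is visible here: the loop may leave part of the dataset unused, saving time and resources at the cost of a (hopefully negligible) loss of accuracy.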

Source journal

Computing (Engineering & Technology — Computer Science: Theory & Methods)
CiteScore: 8.20
Self-citation rate: 2.70%
Articles published: 107
Review time: 3 months
Journal introduction: Computing publishes original papers, short communications and surveys on all fields of computing. Contributions should be written in English and may be of a theoretical or applied nature; the essential criteria are computational relevance and a systematic foundation of results.
Latest articles from this journal

Mapping and just-in-time traffic congestion mitigation for emergency vehicles in smart cities
Fog intelligence for energy efficient management in smart street lamps
Contextual authentication of users and devices using machine learning
Multi-objective service composition optimization problem in IoT for agriculture 4.0
Robust evaluation of GPU compute instances for HPC and AI in the cloud: a TOPSIS approach with sensitivity, bootstrapping, and non-parametric analysis