机器学习中的模型压缩技术综述

IF 3.4 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Applied Intelligence Pub Date : 2024-09-02 DOI:10.1007/s10489-024-05747-w

Pierre Vilar Dantas, Waldir Sabino da Silva Jr, Lucas Carvalho Cordeiro, Celso Barbosa Carvalho

{"title":"机器学习中的模型压缩技术综述","authors":"Pierre Vilar Dantas, Waldir Sabino da Silva Jr, Lucas Carvalho Cordeiro, Celso Barbosa Carvalho","doi":"10.1007/s10489-024-05747-w","DOIUrl":null,"url":null,"abstract":"<p>This paper critically examines model compression techniques within the machine learning (ML) domain, emphasizing their role in enhancing model efficiency for deployment in resource-constrained environments, such as mobile devices, edge computing, and Internet of Things (IoT) systems. By systematically exploring compression techniques and lightweight design architectures, it is provided a comprehensive understanding of their operational contexts and effectiveness. The synthesis of these strategies reveals a dynamic interplay between model performance and computational demand, highlighting the balance required for optimal application. As machine learning (ML) models grow increasingly complex and data-intensive, the demand for computational resources and memory has surged accordingly. This escalation presents significant challenges for the deployment of artificial intelligence (AI) systems in real-world applications, particularly where hardware capabilities are limited. Therefore, model compression techniques are not merely advantageous but essential for ensuring that these models can be utilized across various domains, maintaining high performance without prohibitive resource requirements. Furthermore, this review underscores the importance of model compression in sustainable artificial intelligence (AI) development. The introduction of hybrid methods, which combine multiple compression techniques, promises to deliver superior performance and efficiency. Additionally, the development of intelligent frameworks capable of selecting the most appropriate compression strategy based on specific application needs is crucial for advancing the field. The practical examples and engineering applications discussed demonstrate the real-world impact of these techniques. By optimizing the balance between model complexity and computational efficiency, model compression ensures that the advancements in AI technology remain sustainable and widely applicable. This comprehensive review thus contributes to the academic discourse and guides innovative solutions for efficient and responsible machine learning practices, paving the way for future advancements in the field.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 22","pages":"11804 - 11844"},"PeriodicalIF":3.4000,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-05747-w.pdf","citationCount":"0","resultStr":"{\"title\":\"A comprehensive review of model compression techniques in machine learning\",\"authors\":\"Pierre Vilar Dantas, Waldir Sabino da Silva Jr, Lucas Carvalho Cordeiro, Celso Barbosa Carvalho\",\"doi\":\"10.1007/s10489-024-05747-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>This paper critically examines model compression techniques within the machine learning (ML) domain, emphasizing their role in enhancing model efficiency for deployment in resource-constrained environments, such as mobile devices, edge computing, and Internet of Things (IoT) systems. By systematically exploring compression techniques and lightweight design architectures, it is provided a comprehensive understanding of their operational contexts and effectiveness. The synthesis of these strategies reveals a dynamic interplay between model performance and computational demand, highlighting the balance required for optimal application. As machine learning (ML) models grow increasingly complex and data-intensive, the demand for computational resources and memory has surged accordingly. This escalation presents significant challenges for the deployment of artificial intelligence (AI) systems in real-world applications, particularly where hardware capabilities are limited. Therefore, model compression techniques are not merely advantageous but essential for ensuring that these models can be utilized across various domains, maintaining high performance without prohibitive resource requirements. Furthermore, this review underscores the importance of model compression in sustainable artificial intelligence (AI) development. The introduction of hybrid methods, which combine multiple compression techniques, promises to deliver superior performance and efficiency. Additionally, the development of intelligent frameworks capable of selecting the most appropriate compression strategy based on specific application needs is crucial for advancing the field. The practical examples and engineering applications discussed demonstrate the real-world impact of these techniques. By optimizing the balance between model complexity and computational efficiency, model compression ensures that the advancements in AI technology remain sustainable and widely applicable. This comprehensive review thus contributes to the academic discourse and guides innovative solutions for efficient and responsible machine learning practices, paving the way for future advancements in the field.</p>\",\"PeriodicalId\":8041,\"journal\":{\"name\":\"Applied Intelligence\",\"volume\":\"54 22\",\"pages\":\"11804 - 11844\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://link.springer.com/content/pdf/10.1007/s10489-024-05747-w.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10489-024-05747-w\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-024-05747-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

摘要本文批判性地研究了机器学习（ML）领域的模型压缩技术，强调了这些技术在提高模型效率方面的作用，以便在移动设备、边缘计算和物联网（IoT）系统等资源受限的环境中进行部署。通过系统地探索压缩技术和轻量级设计架构，可以全面了解它们的运行环境和有效性。这些策略的综合运用揭示了模型性能与计算需求之间的动态相互作用，突出了最佳应用所需的平衡。随着机器学习（ML）模型日益复杂和数据密集，对计算资源和内存的需求也相应激增。这种升级给人工智能（AI）系统在实际应用中的部署带来了巨大挑战，尤其是在硬件能力有限的情况下。因此，模型压缩技术不仅具有优势，而且对于确保这些模型能在不同领域中使用、在不需要过多资源的情况下保持高性能至关重要。此外，本综述还强调了模型压缩在人工智能（AI）可持续发展中的重要性。混合方法结合了多种压缩技术，有望带来卓越的性能和效率。此外，开发能够根据特定应用需求选择最合适压缩策略的智能框架对于推动该领域的发展至关重要。所讨论的实际例子和工程应用证明了这些技术在现实世界中的影响。通过优化模型复杂性与计算效率之间的平衡，模型压缩可确保人工智能技术的进步保持可持续性和广泛适用性。因此，本综述有助于学术讨论，并为高效、负责任的机器学习实践提供了创新解决方案，为该领域的未来发展铺平了道路。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A comprehensive review of model compression techniques in machine learning

This paper critically examines model compression techniques within the machine learning (ML) domain, emphasizing their role in enhancing model efficiency for deployment in resource-constrained environments, such as mobile devices, edge computing, and Internet of Things (IoT) systems. By systematically exploring compression techniques and lightweight design architectures, it is provided a comprehensive understanding of their operational contexts and effectiveness. The synthesis of these strategies reveals a dynamic interplay between model performance and computational demand, highlighting the balance required for optimal application. As machine learning (ML) models grow increasingly complex and data-intensive, the demand for computational resources and memory has surged accordingly. This escalation presents significant challenges for the deployment of artificial intelligence (AI) systems in real-world applications, particularly where hardware capabilities are limited. Therefore, model compression techniques are not merely advantageous but essential for ensuring that these models can be utilized across various domains, maintaining high performance without prohibitive resource requirements. Furthermore, this review underscores the importance of model compression in sustainable artificial intelligence (AI) development. The introduction of hybrid methods, which combine multiple compression techniques, promises to deliver superior performance and efficiency. Additionally, the development of intelligent frameworks capable of selecting the most appropriate compression strategy based on specific application needs is crucial for advancing the field. The practical examples and engineering applications discussed demonstrate the real-world impact of these techniques. By optimizing the balance between model complexity and computational efficiency, model compression ensures that the advancements in AI technology remain sustainable and widely applicable. This comprehensive review thus contributes to the academic discourse and guides innovative solutions for efficient and responsible machine learning practices, paving the way for future advancements in the field.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Applied Intelligence 工程技术-计算机：人工智能

CiteScore

6.60

自引率

20.80%

发文量

1361

审稿时长

5.9 months

期刊介绍： With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.