Application performance management using learning, optimization, and control

Xiaoyun Zhu
Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering
Published: 2014-03-22
DOI: 10.1145/2568088.2576098 (https://doi.org/10.1145/2568088.2576098)
Citations: 1

Abstract

In the past decade, the IT industry has experienced a paradigm shift as computing resources became available as a utility through cloud-based services. Despite the wider adoption of cloud computing platforms, some businesses and organizations hesitate to move all their applications to the cloud because of performance concerns. Existing practices in application performance management rely heavily on white-box modeling and diagnosis approaches, or on performance troubleshooting "cookbooks", to find potential bottlenecks and remediation steps. However, the scalability and adaptivity of such approaches remain severely constrained, especially in a highly dynamic, consolidated cloud environment. For performance isolation and differentiation, most modern hypervisors offer powerful resource control primitives such as reservations, limits, and shares for individual virtual machines (VMs). Even so, with the explosive growth of virtual machine sprawl, setting these controls so that co-located virtualized applications get enough resources to meet their respective service level objectives (SLOs) becomes a nearly insoluble task. These challenges present unique opportunities to leverage the rich telemetry collected from applications and systems in the cloud, and to apply statistical learning, optimization, and control-based techniques to develop model-based, automated application performance management frameworks. There has been a large body of research in this area in the last several years, but many problems remain. In this talk, I'll highlight some of the automated and data-driven performance management techniques we have developed, along with related technical challenges. I'll then discuss open research problems, in the hope of attracting more innovative ideas and solutions from a larger community of researchers and practitioners.