Title: How to find research problems
Author: Remzi H. Arpaci-Dusseau
DOI: https://doi.org/10.1145/3568161.3568316
Published: 2022-11-07, Proceedings of the 23rd International Middleware Conference Extended Abstracts

Abstract: In this talk, I discuss how our group approaches the most basic question facing all researchers: how do we find good systems problems to work on? Through examples drawn from a research career now spanning nearly 30 years, I present different problems we have worked on and explain how we arrived at them. The examples highlight our work in file systems, storage systems, and distributed systems, including older work on reliability and more recent work on distributed systems.
Title: Hardware-middleware system co-design for flexible training of foundation models in the cloud
Author: Seetharami R. Seelam
DOI: https://doi.org/10.1145/3568161.3568317
Published: 2022-11-07, Proceedings of the 23rd International Middleware Conference Extended Abstracts

Abstract: Foundation models are a new class of AI models that are trained on broad data (typically via self-supervision) and can be applied to many different downstream tasks. Because self-supervision allows training on massive amounts of unlabeled data, these models have grown to hundreds of billions of parameters, and producing a foundation model can take many months on hundreds of GPUs. AI systems and middleware are therefore critical to training foundation models in a scalable, cost-effective manner. In this talk, I will discuss the architecture of a new cloud-based AI system for training large-scale foundation models. The system is built entirely from an open-source software stack, from the hypervisor to the guest operating systems, and from container platforms to AI frameworks and libraries. It is built natively into the IBM Cloud platform, and its hardware and software stack is optimized for training foundation models on hundreds of GPUs. We trained various foundation models with state-of-the-art accuracy in record time on this platform. I will discuss the architecture, our operational experience, and thoughts on directions for the co-design of hardware and middleware for future AI systems.
{"title":"Proceedings of the 23rd International Middleware Conference Extended Abstracts","authors":"","doi":"10.1145/3568161","DOIUrl":"https://doi.org/10.1145/3568161","url":null,"abstract":"","PeriodicalId":436911,"journal":{"name":"Proceedings of the 23rd International Middleware Conference Extended Abstracts","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125186558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}