Steven W. D. Chien, Artur Podobas, Martin Svedin, A. Tkachuk, Salem El Sayed, Pawel Herman, G. Umanesan, Sai B. Narasimhamurthy, S. Markidis
2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). Published 2022-03-01. DOI: 10.1109/pdp55904.2022.00034 (https://doi.org/10.1109/pdp55904.2022.00034)
NoaSci: A Numerical Object Array Library for I/O of Scientific Applications on Object Storage
Strong consistency and stateful workflows are seen as major factors limiting parallel I/O performance because they require locking and state management. While the POSIX-based I/O model dominates modern HPC storage infrastructure, emerging object storage technology can potentially improve I/O performance by eliminating these bottlenecks. Despite its wide deployment in the cloud, object storage has seen little adoption in HPC. We argue that one reason is the lack of a suitable programming interface for parallel I/O in scientific applications. In this work, we introduce NoaSci, a Numerical Object Array library for scientific applications. NoaSci supports different data formats (e.g., HDF5, binary) and focuses on supporting node-local burst buffers and object stores. We demonstrate for the first time how scientific applications can perform parallel I/O on Seagate’s Motr object store through NoaSci. We evaluate NoaSci’s preliminary performance using the iPIC3D space weather application and position it against existing I/O methods.
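To make the idea of array I/O on an object store concrete, here is a minimal sketch of one way a distributed array could be laid out as independent objects. It is not NoaSci's API and not the Motr client interface. The class `InMemoryObjectStore`, the functions `chunk_key`, `write_chunk`, and `read_chunk`, the key layout, and the array name `field_B` are all hypothetical, chosen only for illustration. The sketch assumes each writer owns exactly one chunk, so a write is a single put on its own key and needs no shared file offset or lock.

```python
import struct
from itertools import product


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store: a flat key -> bytes map."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # A put replaces the whole object; there is no open/seek/close state.
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


def chunk_key(array_name, chunk_index):
    """Build an object key from an array name and an N-d chunk index."""
    return array_name + "/" + ".".join(str(i) for i in chunk_index)


def write_chunk(store, array_name, chunk_index, values):
    """Store one chunk of float64 values as one object (assumed layout)."""
    payload = struct.pack(f"<{len(values)}d", *values)
    store.put(chunk_key(array_name, chunk_index), payload)


def read_chunk(store, array_name, chunk_index):
    """Fetch one chunk and decode it back into a list of floats."""
    raw = store.get(chunk_key(array_name, chunk_index))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Simulate a 2 x 2 process grid; each "rank" writes only the chunk it owns.
store = InMemoryObjectStore()
for rank_index in product(range(2), range(2)):
    local_values = [float(10 * rank_index[0] + rank_index[1])] * 4
    write_chunk(store, "field_B", rank_index, local_values)

print(read_chunk(store, "field_B", (1, 0)))
```

Because every chunk maps to a distinct key, writers never touch the same object, which is the property that lets this style of I/O avoid the locking that a shared POSIX file would need. A real library would also have to record array metadata (shape, data type, chunk grid), which this sketch omits.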