Mimir: Extending I/O Interfaces to Express User Intent for Complex Workloads in HPC
H. Devarajan, K. Mohror
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), published 2023-05-01
DOI: 10.1109/IPDPS54959.2023.00027 (https://doi.org/10.1109/IPDPS54959.2023.00027)
The complexity of data management in HPC systems stems from the diverse I/O behavior of new workloads, from multistage workflows, and from multitiered storage systems. Storage systems manage this complexity by exposing user-level configurations for tuning workload I/O. However, these configurations are difficult to set for users who lack expertise in I/O subsystems. We propose a paradigm change in which users specify the intent of I/O operations and storage systems automatically set the relevant configurations based on the supplied intent. To this end, we developed the Mimir infrastructure, which helps users pass I/O intent to the underlying storage system. We demonstrate several use cases that map user-defined intents to storage configurations that optimize I/O. In this study, we make three observations. First, I/O intents should be applied at each level of the I/O storage stack, from HDF5 to MPI-IO to POSIX, and integrated into the existing stack through lightweight adaptors. Second, the Mimir infrastructure sustains an intent throughput of up to 400M Ops/sec, with a low memory overhead of 6.85 KB per node. Third, intents help configure a hierarchical cache to preload I/O, buffer in a node-local device, and store data in a global cache, which optimizes I/O workloads by 2.33×, 4×, and 2.1×, respectively. Using automatically derived I/O intents, the Mimir infrastructure improves the I/O performance of complex large-scale workflows by up to 4× on the Lassen supercomputer.
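The intent-to-configuration idea described above can be illustrated with a minimal sketch. This is not Mimir's actual API: the names `AccessPattern`, `FileIntent`, `CacheConfig`, and `derive_config` are hypothetical, and the rule that ties each access pattern to a cache action is an assumption made for illustration. Only the three cache actions (preload, node-local buffering, global cache) come from the abstract.

```python
from dataclasses import dataclass
from enum import Enum


class AccessPattern(Enum):
    """Hypothetical intent categories; not taken from the paper."""
    READ_ONLY_SHARED = "read_only_shared"    # many processes read the same file
    WRITE_ONCE_LOCAL = "write_once_local"    # one process writes, e.g. a checkpoint
    PRODUCER_CONSUMER = "producer_consumer"  # one stage writes, a later stage reads


@dataclass(frozen=True)
class FileIntent:
    """Hypothetical user-supplied intent for a single file."""
    path: str
    pattern: AccessPattern


@dataclass(frozen=True)
class CacheConfig:
    """Storage-side settings; the three flags mirror the abstract's cache actions."""
    preload: bool = False
    buffer_node_local: bool = False
    use_global_cache: bool = False


def derive_config(intent: FileIntent) -> CacheConfig:
    """Map an intent to a configuration (assumed rules, for illustration only)."""
    if intent.pattern is AccessPattern.READ_ONLY_SHARED:
        # Shared read-only data can be fetched ahead of the first read.
        return CacheConfig(preload=True)
    if intent.pattern is AccessPattern.WRITE_ONCE_LOCAL:
        # Data written by one process can be absorbed by a node-local device.
        return CacheConfig(buffer_node_local=True)
    if intent.pattern is AccessPattern.PRODUCER_CONSUMER:
        # Data handed between workflow stages stays in a cache visible to all nodes.
        return CacheConfig(use_global_cache=True)
    return CacheConfig()


if __name__ == "__main__":
    intent = FileIntent("/p/dataset.h5", AccessPattern.READ_ONLY_SHARED)
    print(derive_config(intent))
```

The point of the sketch is the division of labor: the user states only what a file is for, and the tuning decision lives in one place on the storage side, where it can change without touching application code.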