Load Balanced Parallel GPU Out-of-Core for Continuous LOD Model Visualization
Chao Peng, Peng Mi, Yong Cao
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, pp. 215-223, 2012-11-10
DOI: 10.1109/SC.Companion.2012.37 (https://doi.org/10.1109/SC.Companion.2012.37)
Rendering massive 3D models is widely recognized as a challenging task. Because GPU memory is limited, a massive model with hundreds of millions of primitives cannot fit into most modern GPUs. With the parallel Level-Of-Detail (LOD) approach proposed in [1], transferring only a portion of the primitives to the GPU, rather than the whole model, is sufficient to generate a desired simplified version of the model. However, the low bandwidth of CPU-GPU communication makes data transfer a very time-consuming process, which prevents users from achieving high-performance rendering of massive 3D models on a single-GPU system. This paper explores a device-level parallel design that distributes the workload across a multi-GPU, multi-display system. Our multi-GPU out-of-core approach uses a load-balancing method and integrates seamlessly with the parallel LOD algorithm. Our experiments show highly interactive frame rates for the “Boeing 777” airplane model, which consists of over 332 million triangles and over 223 million vertices.
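To make the memory constraint concrete, the following is a rough back-of-envelope estimate of the raw geometry size of the model. Only the triangle and vertex counts come from the abstract; the storage layout (32-bit indices, position-only vertices with three 32-bit floats) is an assumption chosen for illustration and may differ from the layout the authors actually use.

```python
# Back-of-envelope footprint of an indexed triangle mesh.
# Counts are the lower bounds quoted in the abstract; the byte sizes
# below are ASSUMPTIONS for illustration, not taken from the paper.
TRIANGLES = 332_000_000   # "over 332 million triangles"
VERTICES = 223_000_000    # "over 223 million vertices"

BYTES_PER_INDEX = 4       # assumed: 32-bit vertex indices
BYTES_PER_VERTEX = 3 * 4  # assumed: position only, three 32-bit floats


def footprint_bytes(triangles: int, vertices: int) -> int:
    """Return the bytes needed for index and vertex data under the assumed layout."""
    index_bytes = triangles * 3 * BYTES_PER_INDEX  # three indices per triangle
    vertex_bytes = vertices * BYTES_PER_VERTEX
    return index_bytes + vertex_bytes


total = footprint_bytes(TRIANGLES, VERTICES)
print(f"{total / 1e9:.2f} GB")  # prints "6.66 GB"
```

Under these assumptions the positions and indices alone come to roughly 6.66 GB, before any normals, colors, or LOD data structures are counted. This is only a sketch of why transferring a selected portion of the primitives, rather than the whole model, matters.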