2016 Workshop on Education for High-Performance Computing (EduHPC), November 13, 2016

Computational Science Education Focused on Future Domain Scientists
Lucas A. Wilson, S. Charlie Dey
DOI: https://doi.org/10.1109/EDUHPC.2016.8

The majority of university courses that educate students in high performance, parallel, and distributed computing are located within computer science departments. This can be a hurdle for students from other disciplines who need to acquire these critical skills. We discuss a sequence of application-driven courses designed to teach undergraduate and graduate students who do not necessarily have a computer science background how to develop scientific research software, with an emphasis on using high performance, parallel, and distributed computing systems.

20 Years of Teaching Parallel Processing to Computer Science Seniors
Jie Liu
DOI: https://doi.org/10.1109/EDUHPC.2016.6

In this paper, we present our Concurrent Systems class, in which parallel programming and parallel and distributed computing (PDC) concepts have been taught for more than 20 years. Despite several rounds of hardware changes, the class maintains its goals: students learn parallel computer organizations, study parallel algorithms, and write code that runs on parallel and distributed platforms. We discuss the benefits of such a class and identify the key elements in developing it and in securing funding to replace outdated hardware. We also share our activities for attracting more students to PDC and related topics.

A Project-Based HPC Course for Single-Box Computers
C. Bederián, N. Wolovick
DOI: https://doi.org/10.1109/EDUHPC.2016.5

Over three iterations and six years, we have developed a project-based HPC course for single-box computers, tailored to science students in general. The course is built on strong premises: showing that assembly is what actually runs on the machine; dividing parallelism into three dimensions (instruction-level, data-level, and thread-level parallelism: ILP, DLP, TLP); and applying them incrementally to a single numerical simulation throughout the course, with students working in interdisciplinary (CS, non-CS) pairs. The final goal is to explore how to use all the available transistors in a die. Assembly proved to be a great tool for showing how the bare metal works, for offering an alternative-semantics view of programs, and for demystifying compiler technology. Parallelism is tackled gradually, with a clear division into instruction, data, and thread parallelism; GPUs, through CUDA in particular, are used as a radically different approach to the same three dimensions. Each dimension is explored incrementally, starting from a sequential toy-yet-interesting numerical simulation. After applying each form of parallelism and submitting a short report, students pool their experiences in a group discussion that reveals the strengths and weaknesses of each form of parallelism for each class of numerical simulation. Although students' backgrounds vary widely, CS and non-CS students pair well in project development, generating mutual understanding of, and appreciation for, each other's disciplines. The experience proved successful, with former students producing parallel, accelerated code of their own in their disciplines.

Scholar: A Campus HPC Resource to Enable Computational Literacy
Michael E. Baldwin, Xiao Zhu, Preston M. Smith, Stephen Lien Harrell, R. Skeel, A. Maji
DOI: https://doi.org/10.1109/EDUHPC.2016.9

To teach the computational science necessary to prepare STEM students for positions in both research and industry, faculty need HPC resources specifically tailored for their classrooms. Scholar was developed as a large-scale computing tool that faculty can use in their classrooms to teach HPC as well as scientific principles and experimentation. In this paper, we discuss the pedagogical need for a campus-wide HPC teaching resource and outline how such a resource was implemented at Purdue University.

Teaching MPI from Mental Models
V. Eijkhout
DOI: https://doi.org/10.1109/EDUHPC.2016.7

The Message Passing Interface (MPI) is the de facto standard for programming large-scale parallelism, with up to millions of individual processes. Its dominant paradigm, Single Program Multiple Data (SPMD) programming, differs from threaded and multicore parallelism to an extent that students have a hard time switching models. In contrast to threaded programming, which allows a view of execution with central control and a central repository of data, SPMD programming has a symmetric model in which all processes are active all the time, none is privileged in any sense, and data is distributed. This model is counterintuitive to the novice parallel programmer, so care must be taken in how the proper "mental model" is instilled. We identify problems with the currently common way of teaching MPI and propose an approach geared toward explicitly reinforcing the symmetric model. Additionally, we teach starting from realistic scenarios, rather than writing artificial code just to exercise a newly learned routine. This motivation implies that we reverse the commonly used order of presenting MPI routines: we start with collectives, and introduce point-to-point routines only later, as support for certain symmetric operations, avoiding the process-to-process model.

Next Generation HPC Workforce Development: The Computer System, Cluster, and Networking Summer Institute
Carolyn Connor, A. Bonnie, G. Grider, Andree Jacobson
DOI: https://doi.org/10.1109/EDUHPC.2016.10

Sustainable and effective computing infrastructure depends critically on the skills and expertise of domain scientists and of committed, well-trained advanced computing professionals. Unlike computing hardware, which has a typical lifetime of a few years, the human infrastructure of technical skills and expertise in operating, maintaining, and evolving advanced computing systems and technology has a lifetime of decades [1]. Given that the effective operation and use of High Performance Computing systems requires specialized and often advanced training, that there is a recognized High Performance Computing skill-set gap, and that there is intense global competition for computing talent, there is a long-standing and critical need for innovative approaches to bridge the gap and create a well-prepared, next-generation High Performance Computing workforce. This paper places that need in the context of the HPC work and workforce requirements at Los Alamos National Laboratory (LANL) and presents one innovative program conceived to address it and grow a High Performance Computing workforce pipeline at LANL: the Computer System, Cluster, and Networking Summer Institute (CSCNSI), which completed its tenth year in 2016. The paper presents an overview of the CSCNSI and a summary of its impact and success, as well as the key factors that have enabled that success.