DNA sequence alignment: An assignment for OpenMP, MPI, and CUDA/OpenCL
Arturo Gonzalez-Escribano, Diego García-Álvarez, Jesús Cámara (Universidad de Valladolid, Spain)
arXiv - CS - Distributed, Parallel, and Cluster Computing, published 2024-09-09, DOI: arxiv-2409.06075
Abstract
We present an assignment for a full Parallel Computing course. Since
2017/2018, we have proposed a different problem each academic year to
illustrate various methodologies for approaching the same computational problem
using different parallel programming models. Each problem is designed to be
parallelized using shared-memory programming with OpenMP, distributed-memory
programming with MPI, and GPU programming with CUDA or OpenCL. The program
chosen for this year implements a brute-force solution for exact DNA sequence
alignment of multiple patterns: it searches for exact matches of
multiple nucleotide strings in a long DNA sequence. The sequential
implementation is designed to be clear and understandable to students while
offering many opportunities for parallelization and optimization. This
assignment addresses key concepts many students find difficult to apply in
practical scenarios: race conditions, reductions, collective operations, and
point-to-point communications. It also covers the problem of parallel
generation of pseudo-random sequences and strategies to notify and stop
speculative computations when matches are found. This assignment serves as an
exercise that reinforces basic knowledge and prepares students for more complex
parallel computing concepts and structures. It has been successfully
used as a practical assignment in a Parallel Computing course in the
third year of a Computer Engineering degree program. Supporting materials for
this and previous assignments in this series are publicly available.
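
The brute-force multi-pattern search and the reduction concept mentioned in the abstract can be illustrated with a minimal sketch in C with OpenMP. This is not the assignment's actual code: the names `find_first_match`, `search_patterns`, `found_at`, and `NOT_FOUND` are hypothetical, and the sketch assumes the sequence and patterns are plain character arrays already held in memory. Each pattern is searched independently, so the only shared update is the match counter, which a `reduction` clause protects from a race condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel meaning "pattern does not occur in the sequence". */
#define NOT_FOUND ((size_t)-1)

/* Brute-force exact search: return the first index at which pat occurs
 * in seq, or NOT_FOUND. Every start position is tried in order. */
size_t find_first_match(const char *seq, size_t seq_len,
                        const char *pat, size_t pat_len)
{
    assert(pat_len > 0);
    if (pat_len > seq_len)
        return NOT_FOUND;
    for (size_t start = 0; start + pat_len <= seq_len; start++) {
        if (memcmp(seq + start, pat, pat_len) == 0)
            return start;
    }
    return NOT_FOUND;
}

/* Search all patterns, store the first match position of pattern p in
 * found_at[p], and return how many patterns were found. */
int search_patterns(const char *seq, size_t seq_len,
                    const char *pats[], int num_pats,
                    size_t *found_at)
{
    int matches = 0;

    /* Iteration p writes only found_at[p], so those writes do not
     * conflict. The counter is shared, hence the reduction clause.
     * Without OpenMP enabled the pragma is ignored and the loop runs
     * sequentially with the same result. */
    #pragma omp parallel for reduction(+:matches) schedule(dynamic)
    for (int p = 0; p < num_pats; p++) {
        found_at[p] = find_first_match(seq, seq_len,
                                       pats[p], strlen(pats[p]));
        if (found_at[p] != NOT_FOUND)
            matches++;
    }
    return matches;
}
```

Calling `search_patterns` on the sequence `"ACGTTGCAACGT"` with the patterns `"TTG"`, `"ACGT"`, and `"GGG"` would report two matches, at positions 3 and 0, with the third pattern not found. The dynamic schedule is one possible choice here, since a pattern that matches early finishes much sooner than one that is never found.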